Maxwell Mkondiwa (m.mkondiwa@cgiar.org) and Jordan Chamberlin (j.chamberlin@cgiar.org)
1 Introduction
Amidst concerns of global agricultural productivity growth slowdown, there is an emerging consensus that crop productivity growth in some African countries has either been stagnant or slow in the past decade. This slowdown is attributed to degrading soil health and volatile weather patterns resulting in low partial productivity especially of nitrogen—hereafter nitrogen use efficiency (NUE) and falling profits. We contribute to this literature by using plot level nationally representative panel data (2008-2020) for Tanzania to examine if indeed NUEs and profits have been on a downward spiral. We combine a causal random forest model for heterogeneous treatment effects and a regional market economic surplus model to explore the economic implications of the NUE trends.
The integrated framework is presented in the figure below.
This notebook provides R code for Mkondiwa and Chamberlin (2004)(https://hdl.handle.net/10883/35304) on estimating the heterogenous treatment effects in Tanzania using non-parametric geoadditive and causal machine learning (ML) models.
# Lm List by yearbaseline_quad_lmList <-lmList(yld_~N_kgha+I(N_kgha^2)+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc+arain_tot|year,tz_lsms_panel)coef(baseline_quad_lmList )
## with interactionswith rainfall# Lm List by yearbaseline_quad_lmList_inter <-lmList(yld_~N_kgha*arain_tot+I(N_kgha^2)*arain_tot+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc|year,tz_lsms_panel)coef(baseline_quad_lmList_inter )
visreg2d(baseline_quad_inter_soils_2010, "N_kgha", "soc_5_15cm",ylab="Soil carbon (g per kg)",xlab="Nitrogen (kg per ha)", main="2010")
Code
visreg2d(baseline_quad_inter_soils_2012, "N_kgha", "soc_5_15cm",ylab="Soil carbon (g per kg)",xlab="Nitrogen (kg per ha)", main="2012")
Code
visreg2d(baseline_quad_inter_soils_2014, "N_kgha", "soc_5_15cm",ylab="Soil carbon (g per kg)",xlab="Nitrogen (kg per ha)", main="2014")
Code
visreg2d(baseline_quad_inter_soils_2020, "N_kgha", "soc_5_15cm",ylab="Soil carbon (g per kg)",xlab="Nitrogen (kg per ha)", main="2020")
Code
# Quadratic plateau or linear plateaulibrary(easynls)tz_lsms_panel.temp <- tz_lsms_panel[, c("yld_","N_kgha")]nls.QP <-nlsfit(tz_lsms_panel.temp, model =4)nls.QP$Model
[1] "y ~ (a + b * x + c * I(x^2)) * (x <= -0.5 * b/c) + (a + I(-b^2/(4 * c))) * (x > -0.5 * b/c)"
Code
mn=nls.QP$Parametersmn
N_kgha
coefficient a -3.909774e-01
coefficient b 1.052253e-02
coefficient c -1.090000e-06
p-value t.test for a 2.841850e-01
p-value t.test for b 0.000000e+00
p-value t.test for c 0.000000e+00
r-squared 8.650000e-02
adjusted r-squared 8.640000e-02
AIC 8.165610e+04
BIC 8.168471e+04
maximum or minimum value for y 2.499715e+01
critical point in x 4.825482e+03
Code
summary(mn)
N_kgha
Min. : -0.39
1st Qu.: 0.00
Median : 0.09
Mean :14015.95
3rd Qu.: 1225.12
Max. :81684.71
5.1.1 Site-year specific Quadratic Only response functions
The site-year specific Quadratic response function can be modeled as At level 1, we have \[ Y_{i} = a_{i} + b_{i}*N + c_{i}*N^2 + \varepsilon_{i} \] At level 2, we have \[
a_{i} \sim N(a_0, \sigma_{a}^2) \\
b_{i} \sim N(b_0, \sigma_{b}^2) \\
c_{i} \sim N(c_0, \sigma_{c}^2) \\
\] This model can be estimated with the linear mixed model function lmer in R package lme4
Linear mixed model fit by REML ['lmerMod']
Formula: yld_ ~ 1 + N_kgha + N2 + (1 | year) + (0 + N_kgha | year) + (0 +
N2 | year)
Data: tz_lsms_panel.temp
REML criterion at convergence: 152692.7
Random effects:
Groups Name Std.Dev.
year (Intercept) 8.001e+01
year.1 N_kgha 2.748e+00
year.2 N2 6.472e-03
Residual 7.960e+02
Number of obs: 9426, groups: year, 5
Fixed Effects:
(Intercept) N_kgha N2
748.29010 11.24225 0.01232
fit warnings:
Some predictor variables are on very different scales: consider rescaling
optimizer (nloptwrap) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings
Code
summary(lmer.Q)
Linear mixed model fit by REML ['lmerMod']
Formula: yld_ ~ 1 + N_kgha + N2 + (1 | year) + (0 + N_kgha | year) + (0 +
N2 | year)
Data: tz_lsms_panel.temp
REML criterion at convergence: 152692.7
Scaled residuals:
Min 1Q Median 3Q Max
-2.3333 -0.6213 -0.3108 0.2772 6.6397
Random effects:
Groups Name Variance Std.Dev.
year (Intercept) 6.402e+03 8.001e+01
year.1 N_kgha 7.554e+00 2.748e+00
year.2 N2 4.188e-05 6.472e-03
Residual 6.337e+05 7.960e+02
Number of obs: 9426, groups: year, 5
Fixed effects:
Estimate Std. Error t value
(Intercept) 748.29010 36.88270 20.288
N_kgha 11.24225 1.75175 6.418
N2 0.01232 0.01525 0.808
Correlation of Fixed Effects:
(Intr) N_kgha
N_kgha -0.051
N2 0.046 -0.657
fit warnings:
Some predictor variables are on very different scales: consider rescaling
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
Or, althernatively, can be estimated with the non-linear mixed model function nlme
Code
library(nlme)nlme.Q <-nlme(yld_ ~ (a + b * N_kgha + c * (N_kgha^2)),data = tz_lsms_panel,fixed = a + b + c ~1,random = a + b + c ~1,groups =~ year,start =c(800, 10, -0.001))nlme.Q
Nonlinear mixed-effects model fit by maximum likelihood
Model: yld_ ~ (a + b * N_kgha + c * (N_kgha^2))
Data: tz_lsms_panel
Log-likelihood: -76345.33
Fixed: a + b + c ~ 1
a b c
747.82829275 11.16211183 0.01323599
Random effects:
Formula: list(a ~ 1, b ~ 1, c ~ 1)
Level: year
Structure: General positive-definite, Log-Cholesky parametrization
StdDev Corr
a 57.45976820 a b
b 1.64304646 -0.022
c 0.02343941 0.946 -0.341
Residual 795.98608057
Number of Observations: 9426
Number of Groups: 5
5.1.2 Site-year specific Quadratic-plus-plateau response functions
The site-year specific response function can be modeled using a hierarchical quadratic-plus-plateau model. At level 1, we have \[ Y_{i} = \min(a_{i} + b_{i}*N + c_{i}*N^2, Y_{max}) + \varepsilon_{i} \] At level 2, we have \[
a_{i} \sim N(a_0, \sigma_{a}^2) \\
b_{i} \sim N(b_0, \sigma_{b}^2) \\
c_{i} \sim N(c_0, \sigma_{c}^2) \\
Y_{max} = a_{i} - b_{i}^2/(4*c_i)
\]
It seems the R function nlme would work to estimate this model
Code
# Define quadratic-plus-plateau functionmq <-lm(yld_ ~ N_kgha +I(N_kgha^2), data=tz_lsms_panel)a0 <-coef(mq)[[1]]b0 <-coef(mq)[[2]]c0 <-coef(mq)[[3]]clx0 <--0.5*b0/c0# Test nls# fx.QP <- function(N, a, b, c) {# y <- (a + b * N + c * I(N^2)) * (N <= -0.5 * b/c) + (a + I(-b^2/(4 * c))) * (N > -0.5 * b/c)# return(y)# }# # nls.QP <- nls(Y ~ fx.QP(N, a, b, c),# start = list(a = a0, b = b0, c = c0), data = dat.Puntel.CC.mean,# control = nls.control(maxiter = 6000))# quadplat <- function(x, a, b, clx) {# ifelse(x < clx, a + b * x + (-0.5*b/clx) * x * x, # a + b * clx + (-0.5*b/clx) * clx * clx)# }# # nls.QP <- nls(Y ~ quadplat(N, a, b, clx), # start = list(a = a0, b = b0, clx = clx0), data = dat.Puntel.CC.mean, # control = nls.control(maxiter = 6000))# nlme.QP <- nlme(Y ~ fx.QP(N, a, b, c),# data = dat.Puntel.CC.mean,# fixed = a + b + c ~ 1,# random = a + b + c ~ 1,# groups = ~ year,# start = c(a0, b0, c0))# # nlme.QP# tz_lsms_panel.nlme <- nlme(yld_ ~ (a + b * N_kgha + c * I(N_kgha ^2)) * (N_kgha <= (-0.5 * b/c)) + (a- I(b^2/(4 * c))) * (N_kgha > (-0.5 * b/c)),# data = tz_lsms_panel,# fixed = a + b + c ~ 1,# random = a + b + c ~ 1,# groups = ~ year,# start = c(a0, b0, c0))# tz_lsms_panel.nlme
Best linear fit using forest predictions (on held-out data)
as well as the mean forest prediction as regressors, along
with one-sided heteroskedasticity-robust (HC3) SEs:
Estimate Std. Error t value Pr(>t)
mean.forest.prediction 0.99724 0.10227 9.7509 < 2.2e-16 ***
differential.forest.prediction 1.25318 0.26830 4.6708 1.521e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Best linear fit using forest predictions (on held-out data)
as well as the mean forest prediction as regressors, along
with one-sided heteroskedasticity-robust (HC3) SEs:
Estimate Std. Error t value Pr(>t)
mean.forest.prediction 0.99353 0.06987 14.2196 < 2.2e-16 ***
differential.forest.prediction 1.23712 0.26364 4.6925 1.369e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
library(ggridges)library(dplyr)library(ggplot2)tau.multi_fert.forest_continuous <-predict(multi_fert.forest_continuous, target.sample ="all", estimate.variance =TRUE)tau.multi_fert.forest_continuous <-as.data.frame(tau.multi_fert.forest_continuous)tau.multi_fert.forest_X <-data.frame(tz_lsms_panel_estim_fert, tau.multi_fert.forest_continuous)NUE_Density=ggplot(tau.multi_fert.forest_X, aes(x = predictions, y ="", fill =factor(stat(quantile)))) +stat_density_ridges(geom ="density_ridges_gradient", calc_ecdf =TRUE,quantiles =4, quantile_lines =TRUE ) +scale_y_discrete(expand =c(0.01, 0)) +scale_fill_viridis_d(name ="Quartiles") +expand_limits(y =1) +theme_bw(base_size =16) +labs(x ="N use effect", y ="Density")NUE_Density
7 Exploring heterogeneity in Conditional N Use Efficiencies from CRF
7.1 Causal RF
Code
library(ggplot2)NperHa_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$N_kgha >0), aes(N_kgha, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Applied N per ha", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))NperHa_CATE_N_plot
Code
Plotsize_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$plotha >0), aes(plotha, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Plot size (ha)", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))Plotsize_CATE_N_plot
Code
P_CATE_N_plot <-ggplot(tau.multi_fert.forest_X, aes(P_kgha, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="P", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))P_CATE_N_plot
Code
Hhmem_CATE_N_plot <-ggplot(tau.multi_fert.forest_X, aes(hhmem, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="HHMEM", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))Hhmem_CATE_N_plot
Code
# By yeartau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==1]=2008tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==2]=2010tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==3]=2012tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==4]=2014tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==5]=2020tau.multi_fert.forest_X$year_det=as.numeric(tau.multi_fert.forest_X$year_det)NperHa_CATE_N_plot_yr <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$N_kgha >0), aes(N_kgha, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +lims(x =c(0, 100)) +labs(x ="Applied N per ha", y ="N use efficiency") +theme_bw(base_size =16) +facet_wrap(~year_det)NperHa_CATE_N_plot_yr
Code
# By soil organic carbonlibrary(ggplot2)soc_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$soc_5_15cm >0), aes(soc_5_15cm, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Soil organic carbon (g per kg)", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))soc_CATE_N_plot
Code
ggsave("figures/soc_CATE_N_plot.png", dpi=300, width=4.16,height=4.16)# By soil sandlibrary(ggplot2)sand_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$sand_0_5cm >0), aes(sand_0_5cm, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Sand (%)", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))sand_CATE_N_plot
Code
ggsave("figures/sand_CATE_N_plot.png", dpi=300, width=4.16,height=4.16)# Electrical conductivitylibrary(ggplot2)soil_elec_cond_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$ECN_5_15cm >0), aes(ECN_5_15cm, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Soil electrical conductivity", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))soil_elec_cond_CATE_N_plot
Code
ggsave("figures/soil_elec_cond_CATE_N_plot.png", dpi=300, width=4.16,height=4.16)# pHlibrary(ggplot2)soil_pH_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$pH_0_5cm >0), aes(pH_0_5cm, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Soil pH", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))soil_pH_CATE_N_plot
Code
ggsave("figures/soil_pH_CATE_N_plot.png", dpi=300, width=4.16,height=4.16)# soil nitrogenlibrary(ggplot2)soil_nitrogen_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$nitrogen_0_5cm >0), aes(nitrogen_0_5cm, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Soil nitrogen (g per kg)", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))soil_nitrogen_CATE_N_plot
Code
ggsave("figures/soil_nitrogen_CATE_N_plot.png", dpi=300, width=4.16,height=4.16)# By population densitylibrary(ggplot2)pop_density_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$population_density >0), aes(population_density, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Population density", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))pop_density_CATE_N_plot
Code
ggsave("figures/pop_density_CATE_N_plot.png", dpi=300, width=4.16,height=4.16)# By elevationlibrary(ggplot2)elev_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$wc2.1_30s_elev >0), aes(wc2.1_30s_elev, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Elevation (masl)", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))elev_CATE_N_plot
Code
ggsave("figures/elev_CATE_N_plot.png", dpi=300, width=4.16,height=4.16)# By rainfalllibrary(ggplot2)rainfall_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$arain_tot >0), aes(arain_tot, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Rainfall (mm)", y ="N use efficiency")+facet_wrap(~year_det)previous_theme <-theme_set(theme_bw(base_size =16))rainfall_CATE_N_plot
8 Are N use efficiencies falling overtime?
8.1 Causal RF
Code
tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==1]=2008tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==2]=2010tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==3]=2012tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==4]=2014tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==5]=2020tau.multi_fert.forest_X$year_det=as.numeric(tau.multi_fert.forest_X$year_det)Year_CATE_N_plot=ggplot(tau.multi_fert.forest_X,aes(year_det,predictions))+geom_smooth(method="loess",formula=y~x,col="darkblue")+labs(x="Year",y="N use efficiency")+scale_x_continuous(breaks =c(2008, 2010, 2012,2014,2020))previous_theme <-theme_set(theme_bw())Year_CATE_N_plot
library(ggridges)library(dplyr)library(ggplot2)tau.multi_fert.forest_dummy <-predict(multi_fert.forest_binary, target.sample ="all", estimate.variance =TRUE)tau.multi_fert.forest_dummy <-as.data.frame(tau.multi_fert.forest_dummy)tau.multi_fert.forest_X_dummy <-data.frame(tz_lsms_panel_estim_fert, tau.multi_fert.forest_dummy)ggplot(tau.multi_fert.forest_X_dummy, aes(x = predictions, y ="", fill =factor(stat(quantile)))) +stat_density_ridges(geom ="density_ridges_gradient", calc_ecdf =TRUE,quantiles =4, quantile_lines =TRUE ) +scale_y_discrete(expand =c(0.01, 0)) +scale_fill_viridis_d(name ="Quartiles") +expand_limits(y =1) +theme_bw(base_size =16) +labs(x ="Inorgatic fert use effect", y ="Density")
Code
#library(sp)library(sf)tau.multi_fert.forest_X_dummy_sp <-SpatialPointsDataFrame(cbind(tau.multi_fert.forest_X_dummy$lon_modified, tau.multi_fert.forest_X_dummy$lat_modified), data = tau.multi_fert.forest_X_dummy, proj4string =CRS("+proj=longlat +datum=WGS84"))library(tmap)tmap_mode("plot")Nuse_effect_map <-tm_shape(tau.multi_fert.forest_X_dummy_sp) +tm_dots(col ="predictions", title ="Effect of N use on yield (kg/ha)", style ="quantile") +tm_layout(legend.outside =TRUE)Nuse_effect_map
Code
tmap_save(Nuse_effect_map, "figures/Nuse_effect_map.png", width =600, height =600, asp =0)# N Use efficiency maplibrary(sp)library(sf)tau.multi_fert.forest_X_sp <-SpatialPointsDataFrame(cbind(tau.multi_fert.forest_X$lon_modified, tau.multi_fert.forest_X$lat_modified), data = tau.multi_fert.forest_X, proj4string =CRS("+proj=longlat +datum=WGS84"))library(tmap)tmap_mode("plot")Nuse_efficiency_map <-tm_shape(tau.multi_fert.forest_X_sp) +tm_dots(col ="predictions", title ="NUE (kg Maize/Kg N)", style ="quantile") +tm_layout(legend.outside =TRUE)Nuse_efficiency_map
Code
tmap_save(Nuse_efficiency_map, "figures/Nuse_efficiency_map.png", width =600, height =600, asp =0)# N Use maplibrary(sp)library(sf)tau.multi_fert.forest_X_sp_small <-subset(tau.multi_fert.forest_X_sp, tau.multi_fert.forest_X_sp$N_kgha >0)library(tmap)tmap_mode("plot")Nuse_map <-tm_shape(tau.multi_fert.forest_X_sp_small) +tm_dots(col ="N_kgha", title ="N applied (Kg/ha)", style ="quantile") +tm_layout(legend.outside =TRUE)Nuse_map
Code
tmap_save(Nuse_map, "figures/Nuse_map.png", width =600, height =600, asp =0)
tau.multi_fert.forest_X_sp$region_name <-over(tau.multi_fert.forest_X_sp, TZ_sp[,"NAME_1"])# Summary by region and by yearlibrary(modelsummary)tau.multi_fert.forest_X_sp_dt <-tau.multi_fert.forest_X_sp@datatau.multi_fert.forest_X_sp_dt$region_name_final=tau.multi_fert.forest_X_sp_dt$region_name$NAME_1NUE_cont <-datasummary(Heading("region_name")*region_name_final~Heading("N_obs") * N +Heading("percent") *Percent()+ predictions*(Mean+SD),data = tau.multi_fert.forest_X_sp_dt, output="data.frame")NUE_cont$N_obs=as.numeric(NUE_cont$N_obs)NUE_cont$Mean=as.numeric(NUE_cont$Mean)NUE_cont$SD=as.numeric(NUE_cont$SD)library(reactable)library(htmltools)library(fontawesome)htmltools::browsable(tagList( tags$button(tagList(fontawesome::fa("download"), "Download as CSV"),onclick ="Reactable.downloadDataCSV('NUE_cont', 'NUE_cont.csv')" ),reactable( NUE_cont,searchable =TRUE,defaultPageSize =38,elementId ="NUE_cont" ) ))
Code
# Remove regions with fewer observations than 50NUE_cont=subset(NUE_cont,NUE_cont$N_obs>50)NUE_cont_sf=merge(TZ_sf,NUE_cont, by.x="NAME_1",by.y="region_name")# Map the library(tmap)tmap_mode("plot")tm_shape(NUE_cont_sf) +tm_polygons(col ="Mean", title ="Average NUE", style ="quantile") +tm_layout(legend.outside =TRUE)
pred <-SpatialPoints(elevationglobal_geodata_aoi)@coordspred <-as.data.frame(pred)names(pred)[1:2] <-c("lon", "lat")pred <- pred %>%drop_na()library(sp)pred_sp <-SpatialPointsDataFrame(cbind(pred$lon, pred$lat), data = pred, proj4string =CRS("+proj=longlat +datum=WGS84"))yield_gain_hat <-predict(b, newdata = pred)yield_gain_hat <-as.data.frame(yield_gain_hat)pred_yield_gain_hat <-cbind(pred, yield_gain_hat )pred_yield_gain_hat$sigma <-NULLlibrary(terra)yield_gain_ras <-rast(pred_yield_gain_hat, type ="xyz")plot(yield_gain_ras )
Code
library(raster)yield_gain_ras2 <-raster(yield_gain_ras)library(sf)TZ_sf_dis_sp <-as_Spatial(TZ_sf_dis)yield_gain_ras2 <-mask(yield_gain_ras2, TZ_sf_dis_sp)plot(yield_gain_ras2, main ="Yield gains to N")
Alston, J.M., Norton, G.W., and Pardey, P. 1995. “Science under scarcity: Principles and practice for agricultural research evaluation and priority setting”. CAB International. See Chapter 7 (pp. 463-498).
Athey, S., Tibshirani, J., and Wager, S. 2019. “Generalized Random Forests”. Annals of Statistics 47 (2): 1148-1178. Doi: 10.1214/18-AOS1709.
Source Code
---title: "Tanzania fertilizer use efficiency, profitability and risks across time"format: html: code-fold: true code-tools: truefig-dpi: 300fig-width: 8.88fig-align: centerfig-height: 5self-contained: trueauthor: Maxwell Mkondiwa (m.mkondiwa@cgiar.org) and Jordan Chamberlin (j.chamberlin@cgiar.org)editor: visualtoc: truetoc-location: leftnumber-sections: trueexecute: message: false warning: false echo: true---# IntroductionAmidstconcerns of global agricultural productivity growth slowdown, there is anemerging consensus that crop productivity growth in some African countries haseither been stagnant or slow in the past decade. This slowdown is attributed todegrading soil health and volatile weather patterns resulting in low partialproductivity especially of nitrogen—hereafter nitrogen use efficiency (NUE) andfalling profits. We contribute to this literature by using plot levelnationally representative panel data (2008-2020) for Tanzania to examine ifindeed NUEs and profits have been on a downward spiral. We combine a causalrandom forest model for heterogeneous treatment effects and a regional marketeconomic surplus model to explore the economic implications of the NUE trends.The integrated framework is presented in the figure below.This notebook provides R code for Mkondiwa and Chamberlin (2004)(<https://hdl.handle.net/10883/35304>) on estimating the heterogenous treatment effects in Tanzania using non-parametric geoadditive and causal machine learning (ML) models.# Exploration```{r}# package namespackages <-c("gplots", "modelsummary", "grf", "policytree", "ggplot2", "micEcon", "frontier", "dplyr", "tidyr", "knitr", "car", "RColorBrewer", "DT", "rio", "tidyr", "dsfa", "mgcv", "geodata", "sf", "mapview", "dplyr", "terra", "raster", "ggridges", "rio", "BART", "BART", "BayesTree", "bartCause", "plm", "rlearner")# install packages# installed_packages <- packages %in% rownames(installed.packages())# if (any(installed_packages == FALSE)) {# install.packages(packages[!installed_packages])# }# load packagesinvisible(lapply(packages, library, character.only =TRUE))# install.packages("collapse", repos = "https://fastverse.r-universe.dev")library(rio)#tz_lsms_panel <- import("tz_lsms_panel_plot_level_221117.dta")tz_lsms_panel <-import("tz_lsms_panel.csv")tz_lsms_panel$yld_[tz_lsms_panel$yld_ >quantile(tz_lsms_panel$yld_, 0.99, na.rm =TRUE)] <-NAtz_lsms_panel$N_kgha[tz_lsms_panel$N_kgha >quantile(tz_lsms_panel$N_kgha, 0.99, na.rm =TRUE)] <-NAtz_lsms_panel$plotha[tz_lsms_panel$plotha >quantile(tz_lsms_panel$plotha, 0.99, na.rm =TRUE)] <-NAtz_lsms_panel$N_kgha_dum <-0tz_lsms_panel$N_kgha_dum[tz_lsms_panel$N_kgha >0] <-1tz_lsms_panel$N_kgha_dum <-as.numeric(tz_lsms_panel$N_kgha_dum)tz_lsms_panel$N_kgha_cond <- tz_lsms_panel$N_kghatz_lsms_panel$N_kgha_cond[tz_lsms_panel$N_kgha_dum ==0] <-NAtz_lsms_panel$N_kgha_cond <-as.numeric(tz_lsms_panel$N_kgha_cond)tz_lsms_panel=subset(tz_lsms_panel, !(is.na(tz_lsms_panel$yld_)))tz_lsms_panel=subset(tz_lsms_panel, !(is.na(tz_lsms_panel$N_kgha)))tz_lsms_panel=subset(tz_lsms_panel, !(is.na(tz_lsms_panel$plotha)))tz_lsms_panel=subset(tz_lsms_panel, !(is.na(tz_lsms_panel$P_kgha)))tz_lsms_panel=subset(tz_lsms_panel, !(is.na(tz_lsms_panel$harv_yld_mai)))tz_lsms_panel$soc_5_15cm=tz_lsms_panel$`soc_5-15cm`tz_lsms_panel$nitrogen_0_5cm=tz_lsms_panel$`nitrogen_0-5cm`tz_lsms_panel$sand_0_5cm=tz_lsms_panel$`sand_0-5cm`tz_lsms_panel$ECN_5_15cm=tz_lsms_panel$`ECN_5-15cm`tz_lsms_panel$pH_0_5cm=tz_lsms_panel$`pH_0-5cm````# Descriptives table```{r}library(modelsummary)mean_na <-function(x) mean(x, na.rm =TRUE)sd_na <-function(x) SD(x, na.rm =TRUE)summary_table_by_year <-datasummary(Heading("N obs") * N +Heading("%") *Percent() + (yld_ + N_kgha_dum +N_kgha_cond+ N_kgha + P_kgha+plotha + ncrops + hhmem + femhead + headeduc + arain_tot+arain_cv) * (mean_na + sd_na) ~Factor(year), data = tz_lsms_panel, output ="data.frame")summary_table_by_year_mean <-datasummary(Heading("N obs") * N +Heading("%") *Percent() + (yld_ + N_kgha_dum +N_kgha_cond+ N_kgha + P_kgha+plotha + ncrops + hhmem + femhead + headeduc + arain_tot+arain_cv) * (mean_na) ~Factor(year), data = tz_lsms_panel, output ="data.frame")library(reactable)library(htmltools)library(fontawesome)htmltools::browsable(tagList( tags$button(tagList(fontawesome::fa("download"), "Download as CSV"),onclick ="Reactable.downloadDataCSV('summary_table_by_year', 'summary_table_by_year.csv')" ),reactable( summary_table_by_year,searchable =TRUE,defaultPageSize =38,elementId ="summary_table_by_year" ) ))library(reactable)library(htmltools)library(fontawesome)htmltools::browsable(tagList( tags$button(tagList(fontawesome::fa("download"), "Download as CSV"),onclick ="Reactable.downloadDataCSV('summary_table_by_year_mean', 'summary_table_by_year_mean.csv')" ),reactable( summary_table_by_year_mean,searchable =TRUE,defaultPageSize =38,elementId ="summary_table_by_year_mean" ) ))# # By district and by year# summary_table_by_year_dist <- datasummary(Heading("N obs") * N + Heading("%") * Percent() + (yld_ + N_kgha_g_0 + N_kgha + plotha + P_kgha + ncrops + hhmem + femhead + headeduc + arain_tot) * (mean_na + sd_na) ~ Factor(year*district), data = tz_lsms_panel, output = "data.frame")# # library(reactable)# library(htmltools)# library(fontawesome)# # htmltools::browsable(# tagList(# tags$button(# tagList(fontawesome::fa("download"), "Download as CSV"),# onclick = "Reactable.downloadDataCSV('summary_table_by_year_dist', 'summary_table_by_year_dist.csv')"# ),# reactable(# summary_table_by_year,# searchable = TRUE,# defaultPageSize = 38,# elementId = "summary_table_by_year_dist"# )# )# )tz_lsms_panel$region=as.factor(tz_lsms_panel$region)tz_lsms_panel$year=as.factor(tz_lsms_panel$year)# By district and by yearsummary_table_by_year_region <-datasummary(Heading("N obs") * N +Heading("%") *Percent() + (yld_ + fert1_bin+N_kgha_dum +N_kgha_cond+ N_kgha + P_kgha+plotha+ ncrops + hhmem + femhead + headeduc + arain_tot)*(mean_na) ~ year*region, data = tz_lsms_panel, output ="data.frame")summary_table_by_year_region_mean <-datasummary(Heading("N obs") * N +Heading("%") *Percent() + (yld_ +fert1_bin+ N_kgha_dum +N_kgha_cond+ N_kgha + P_kgha+plotha + ncrops + hhmem + femhead + headeduc + arain_tot)*(mean_na) ~ year*region, data = tz_lsms_panel, output ="data.frame")summary_table_by_year_region_t=t(summary_table_by_year_region)summary_table_by_year_region_mean_t=t(summary_table_by_year_region_mean)library(reactable)library(htmltools)library(fontawesome)htmltools::browsable(tagList( tags$button(tagList(fontawesome::fa("download"), "Download as CSV"),onclick ="Reactable.downloadDataCSV('summary_table_by_year_region_t', 'summary_table_by_year_region_t.csv')" ),reactable( summary_table_by_year_region_t,searchable =TRUE,defaultPageSize =38,elementId ="summary_table_by_year_region_t" ) ))library(reactable)library(htmltools)library(fontawesome)htmltools::browsable(tagList( tags$button(tagList(fontawesome::fa("download"), "Download as CSV"),onclick ="Reactable.downloadDataCSV('summary_table_by_year_region_mean_t', 'summary_table_by_year_region_mean_t.csv')" ),reactable( summary_table_by_year_region_mean_t,searchable =TRUE,defaultPageSize =38,elementId ="summary_table_by_year_region_mean_t" ) ))```# Graphics```{r}library(gplots)plotmeans(yld_ ~ year, main="Yield heterogeineity across years",xlab="Year", ylab="Maize yield (kg/ha)", data=tz_lsms_panel)library(gplots)plotmeans(N_kgha ~ year, main="N per ha heterogeineity across years", data=tz_lsms_panel,xlab="Year", ylab="Average N per ha" )plotmeans(arain_tot~ year, main="Rainfall across years", data=tz_lsms_panel,xlab="Year", ylab="Average total rainfall (mm)" )library(easynls)library(lme4)library(data.table)library(ggplot2)# summary_table_by_year_mean_t=t(summary_table_by_year_mean)# # ggplot(summary_table_by_year_mean) +# geom_line(aes(x=N, y=Y, group=year, color=year))# # ggplot(summary_table_by_year_region) +# geom_line(aes(x=N, y=Y, group=region, color=region))```# Conventional Production Function Approach## Linear and non-linear parameter models: e.g., Quadratic```{r}library(data.table)library(ggplot2)library(easynls)library(lme4)library(nlme)tz_lsms_panel=data.table(tz_lsms_panel)tz_lsms_panel$year=as.factor(tz_lsms_panel$year)#tz_lsms_panel_sf_subset=subset(tz_lsms_panel_sf, #select=c("V1","yld_","harv_yld_mai","N_kgha","P_kgha","year","soc_5-15cm","population_density","wc2.1_30s_elev","#sand_0-5cm","nitrogen_0-5cm","ECN_5-15cm","pH_0-5cm")) #library(tidyr)#tz_lsms_panel_sf_subset=tz_lsms_panel_sf_subset %>% drop_na()# Linearbaseline_ols=lm(yld_~N_kgha+plotha+P_kgha+ncrops+hhmem+femhead+headeduc+arain_tot+region+year,data=tz_lsms_panel)summary(baseline_ols)# Lm List by yearbaseline_ols_yearList <-lmList(yld_~N_kgha+plotha+P_kgha+ncrops+hhmem+femhead+headeduc+arain_tot|year,tz_lsms_panel)baseline_ols_yearList_est <-coef(baseline_ols_yearList)baseline_ols_yearList_est <-t(as.data.frame(baseline_ols_yearList_est))order_tab <-rep(1,9)order_tab <-as.data.frame(order_tab)baseline_ols_yearList_est_dt=cbind(baseline_ols_yearList_est,order_tab)baseline_ols_yearList_ci <-confint(baseline_ols_yearList)baseline_ols_yearList_ci_dt <-t(as.data.frame(baseline_ols_yearList_ci))order_tab <-rep(2:3,9)order_tab <-as.data.frame(order_tab)baseline_ols_yearList_ci_dt=cbind(baseline_ols_yearList_ci_dt,order_tab)baseline_ols_yearList_est_ci_dt=bind_rows(baseline_ols_yearList_est_dt,baseline_ols_yearList_ci_dt)write.csv(baseline_ols_yearList_est_ci_dt,"tables/baseline_ols_yearList_est_ci_dt.csv")library(reactable)library(htmltools)library(fontawesome)htmltools::browsable(tagList( tags$button(tagList(fontawesome::fa("download"), "Download as CSV"),onclick ="Reactable.downloadDataCSV('baseline_ols_yearList_est_ci_dt', 'baseline_ols_yearList_est_ci_dt.csv')" ),reactable( baseline_ols_yearList_est_ci_dt,searchable =TRUE,defaultPageSize =38,elementId ="baseline_ols_yearList_est_ci_dt" ) ))(ci <-confint(baseline_ols_yearList))plot(ci,cex=2, cex.lab=3, xlab="Maize yield (kg/ha) response", ylab="Year")# Visreglibrary(visreg)tz_lsms_panel_2008 <-subset(tz_lsms_panel,tz_lsms_panel$year=="2008")tz_lsms_panel_2010 <-subset(tz_lsms_panel,tz_lsms_panel$year=="2010")tz_lsms_panel_2012 <-subset(tz_lsms_panel,tz_lsms_panel$year=="2012")tz_lsms_panel_2014 <-subset(tz_lsms_panel,tz_lsms_panel$year=="2014")tz_lsms_panel_2020 <-subset(tz_lsms_panel,tz_lsms_panel$year=="2020")baseline_ols_2008 <-lm(yld_~N_kgha+plotha+P_kgha+ncrops+hhmem+femhead+headeduc+arain_tot,data=tz_lsms_panel_2008)baseline_ols_2010 <-lm(yld_~N_kgha+plotha+P_kgha+ncrops+hhmem+femhead+headeduc+arain_tot,data=tz_lsms_panel_2010)baseline_ols_2012 <-lm(yld_~N_kgha+plotha+P_kgha+ncrops+hhmem+femhead+headeduc+arain_tot,data=tz_lsms_panel_2012)baseline_ols_2014 <-lm(yld_~N_kgha+plotha+P_kgha+ncrops+hhmem+femhead+headeduc+arain_tot,data=tz_lsms_panel_2014)baseline_ols_2020 <-lm(yld_~N_kgha+plotha+P_kgha+ncrops+hhmem+femhead+headeduc+arain_tot,data=tz_lsms_panel_2020)baseline_ols_2008_v <-visreg(baseline_ols_2008, "N_kgha", plot=FALSE)baseline_ols_2010_v <-visreg(baseline_ols_2010, "N_kgha", plot=FALSE)baseline_ols_2012_v <-visreg(baseline_ols_2012, "N_kgha", plot=FALSE)baseline_ols_2014_v <-visreg(baseline_ols_2014, "N_kgha", plot=FALSE)baseline_ols_2020_v <-visreg(baseline_ols_2020, "N_kgha", plot=FALSE)baseline_ols_all_plot <-visregList(baseline_ols_2008_v,baseline_ols_2010_v,baseline_ols_2012_v,baseline_ols_2014_v,baseline_ols_2020_v, collapse=TRUE,labels=c("2008","2010","2012","2014","2020"))op <-par(mfrow=c(1,5))par(op)plot(baseline_ols_all_plot,ylab="Maize yield (kg per ha)",xlab="Applied nitrogen (kg per ha)", ylim=c(0,3000), main="Linear",partial=FALSE,rug=FALSE)## LINEAR PLATEAU# library(easynls)# library(data.table)# tz_lsms_panel <- data.table(tz_lsms_panel)# tz_lsms_panel.temp <- tz_lsms_panel[, .("yld_","N_kgha")]# nls.LP <- nlsfit(tz_lsms_panel.temp, model = 3)# nls.LP$Model# lp_model=nls.LP$Parameters# lp_model# summary(lp_model)# QUADRATICbaseline_quad=lm(yld_~N_kgha+I(N_kgha^2)+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc+arain_tot+region+year,data=tz_lsms_panel)summary(baseline_quad)library(modelsummary)modelplot(baseline_quad)# Lm List by yearbaseline_quad_lmList <-lmList(yld_~N_kgha+I(N_kgha^2)+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc+arain_tot|year,tz_lsms_panel)coef(baseline_quad_lmList )(ci <-confint(baseline_quad_lmList ))plot(ci,cex=2, cex.lab=3, xlab="Maize yield (kg/ha) response", ylab="Year")baseline_quad_lmList_est <-coef(baseline_quad_lmList)baseline_quad_lmList_est <-t(as.data.frame(baseline_quad_lmList_est))order_tab <-rep(1,11)order_tab <-as.data.frame(order_tab)baseline_quad_lmList_est_dt=cbind(baseline_quad_lmList_est,order_tab)baseline_quad_lmList_ci <-confint(baseline_quad_lmList)baseline_quad_lmList_ci_dt <-t(as.data.frame(baseline_quad_lmList_ci))order_tab <-rep(2:3,11)order_tab <-as.data.frame(order_tab)baseline_quad_lmList_ci_dt=cbind(baseline_quad_lmList_ci_dt,order_tab)baseline_quad_lmList_est_ci_dt=bind_rows(baseline_quad_lmList_est_dt,baseline_quad_lmList_ci_dt)write.csv(baseline_quad_lmList_est_ci_dt,"tables/baseline_quad_lmList_est_ci_dt.csv")## Visreglibrary(visreg)baseline_quad_2008 <-lm(yld_~N_kgha+I(N_kgha^2)+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc+arain_tot,data=tz_lsms_panel_2008)baseline_quad_2010 <-lm(yld_~N_kgha+I(N_kgha^2)+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc+arain_tot,data=tz_lsms_panel_2010)baseline_quad_2012 <-lm(yld_~N_kgha+I(N_kgha^2)+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc+arain_tot,data=tz_lsms_panel_2012)baseline_quad_2014 <-lm(yld_~N_kgha+I(N_kgha^2)+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc+arain_tot,data=tz_lsms_panel_2014)baseline_quad_2020 <-lm(yld_~N_kgha+I(N_kgha^2)+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc+arain_tot,data=tz_lsms_panel_2020)baseline_quad_2008_v <-visreg(baseline_quad_2008, "N_kgha", plot=FALSE)baseline_quad_2010_v <-visreg(baseline_quad_2010, "N_kgha", plot=FALSE)baseline_quad_2012_v <-visreg(baseline_quad_2012, "N_kgha", plot=FALSE)baseline_quad_2014_v <-visreg(baseline_quad_2014, "N_kgha", plot=FALSE)baseline_quad_2020_v <-visreg(baseline_quad_2020, "N_kgha", plot=FALSE)baseline_quad_all_plot <-visregList(baseline_quad_2008_v,baseline_quad_2010_v,baseline_quad_2012_v,baseline_quad_2014_v,baseline_quad_2020_v, collapse=TRUE,labels=c("2008","2010","2012","2014","2020"))op <-par(mfrow=c(1,5))par(op)plot(baseline_quad_all_plot,ylab="Maize yield (kg per ha)",xlab="Applied nitrogen (kg per ha)", ylim=c(0,3000), main="Quadratic",partial=FALSE,rug=FALSE)## with interactionswith rainfall# Lm List by yearbaseline_quad_lmList_inter <-lmList(yld_~N_kgha*arain_tot+I(N_kgha^2)*arain_tot+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc|year,tz_lsms_panel)coef(baseline_quad_lmList_inter )(ci <-confint(baseline_quad_lmList_inter))plot(ci,cex=2, cex.lab=3, xlab="Maize yield (kg/ha) response", ylab="Year")baseline_quad_lmList_inter_est <-coef(baseline_quad_lmList_inter)baseline_quad_lmList_inter_est <-t(as.data.frame(baseline_quad_lmList_inter_est))order_tab <-rep(1,13)order_tab <-as.data.frame(order_tab)baseline_quad_lmList_inter_est_dt=cbind(baseline_quad_lmList_inter_est,order_tab)baseline_quad_lmList_inter_ci <-confint(baseline_quad_lmList_inter)baseline_quad_lmList_inter_ci_dt <-t(as.data.frame(baseline_quad_lmList_inter_ci))order_tab <-rep(2:3,13)order_tab <-as.data.frame(order_tab)baseline_quad_lmList_inter_ci_dt=cbind(baseline_quad_lmList_inter_ci_dt,order_tab)baseline_quad_lmList_inter_est_ci_dt=bind_rows(baseline_quad_lmList_inter_est_dt,baseline_quad_lmList_inter_ci_dt)write.csv(baseline_quad_lmList_inter_est_ci_dt,"tables/baseline_quad_lmList_inter_est_ci_dt.csv")library(reactable)library(htmltools)library(fontawesome)htmltools::browsable(tagList( tags$button(tagList(fontawesome::fa("download"), "Download as CSV"),onclick ="Reactable.downloadDataCSV('baseline_quad_lmList_inter_est_ci_dt', 'baseline_quad_lmList_inter_est_ci_dt.csv')" ),reactable( baseline_quad_lmList_inter_est_ci_dt,searchable =TRUE,defaultPageSize =38,elementId ="baseline_quad_lmList_inter_est_ci_dt" ) ))## Visreglibrary(visreg)baseline_quad_inter_2008 <-lm(yld_~N_kgha*arain_tot+I(N_kgha^2)*arain_tot+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc,data=tz_lsms_panel_2008)baseline_quad_inter_2010 <-lm(yld_~N_kgha*arain_tot+I(N_kgha^2)*arain_tot+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc,data=tz_lsms_panel_2010)baseline_quad_inter_2012 <-lm(yld_~N_kgha*arain_tot+I(N_kgha^2)*arain_tot+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc,data=tz_lsms_panel_2012)baseline_quad_inter_2014 <-lm(yld_~N_kgha*arain_tot+I(N_kgha^2)*arain_tot+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc,data=tz_lsms_panel_2014)baseline_quad_inter_2020 <-lm(yld_~N_kgha*arain_tot+I(N_kgha^2)*arain_tot+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc,data=tz_lsms_panel_2020)visreg2d(baseline_quad_inter_2008, "N_kgha", "arain_tot",ylab="Rainfall (mm)",xlab="Nitrogen (kg per ha)", main="2008")visreg2d(baseline_quad_inter_2010, "N_kgha", "arain_tot",ylab="Rainfall (mm)",xlab="Nitrogen (kg per ha)", main="2010")visreg2d(baseline_quad_inter_2012, "N_kgha", "arain_tot",ylab="Rainfall (mm)",xlab="Nitrogen (kg per ha)", main="2012")visreg2d(baseline_quad_inter_2014, "N_kgha", "arain_tot",ylab="Rainfall (mm)",xlab="Nitrogen (kg per ha)", main="2014")visreg2d(baseline_quad_inter_2020, "N_kgha", "arain_tot",ylab="Rainfall (mm)",xlab="Nitrogen (kg per ha)", main="2020")# Soil properties Interacted with soils baseline_quad_lmList_inter_soil <-lmList(yld_~N_kgha*soc_5_15cm+I(N_kgha^2)+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc+arain_tot+population_density+wc2.1_30s_elev+sand_0_5cm+nitrogen_0_5cm+soc_5_15cm+ ECN_5_15cm+pH_0_5cm|year,tz_lsms_panel)coef(baseline_quad_lmList_inter_soil )(ci <-confint(baseline_quad_lmList_inter_soil))plot(ci,cex=2, cex.lab=3, xlab="Maize yield (kg/ha) response", ylab="Year")baseline_quad_lmList_inter_soil_est <-coef(baseline_quad_lmList_inter_soil)baseline_quad_lmList_inter_soil_est <-t(as.data.frame(baseline_quad_lmList_inter_soil_est))order_tab <-rep(1,19)order_tab <-as.data.frame(order_tab)baseline_quad_lmList_inter_soil_est_dt=cbind(baseline_quad_lmList_inter_soil_est,order_tab)baseline_quad_lmList_inter_soil_ci <-confint(baseline_quad_lmList_inter_soil)baseline_quad_lmList_inter_soil_ci_dt <-t(as.data.frame(baseline_quad_lmList_inter_soil_ci))order_tab <-rep(2:3,19)order_tab <-as.data.frame(order_tab)baseline_quad_lmList_inter_soil_ci_dt=cbind(baseline_quad_lmList_inter_soil_ci_dt,order_tab)baseline_quad_lmList_inter_soil_est_ci_dt=bind_rows(baseline_quad_lmList_inter_soil_est_dt,baseline_quad_lmList_inter_soil_ci_dt)write.csv(baseline_quad_lmList_inter_soil_est_ci_dt,"tables/baseline_quad_lmList_inter_soil_est_ci_dt.csv")library(reactable)library(htmltools)library(fontawesome)htmltools::browsable(tagList( tags$button(tagList(fontawesome::fa("download"), "Download as CSV"),onclick ="Reactable.downloadDataCSV('baseline_quad_lmList_inter_soil_est_ci_dt', 'baseline_quad_lmList_inter_soil_est_ci_dt.csv')" ),reactable( baseline_quad_lmList_inter_soil_est_ci_dt,searchable =TRUE,defaultPageSize =38,elementId ="baseline_quad_lmList_inter_soil_est_ci_dt" ) ))## Visreglibrary(visreg)baseline_quad_inter_soils_2008 <-lm(yld_~N_kgha*soc_5_15cm+I(N_kgha^2)+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc+arain_tot+population_density+wc2.1_30s_elev+sand_0_5cm+nitrogen_0_5cm+soc_5_15cm+ ECN_5_15cm+pH_0_5cm,data=tz_lsms_panel_2008)baseline_quad_inter_soils_2010 <-lm(yld_~N_kgha*soc_5_15cm+I(N_kgha^2)+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc+arain_tot+population_density+wc2.1_30s_elev+sand_0_5cm+nitrogen_0_5cm+soc_5_15cm+ ECN_5_15cm+pH_0_5cm,data=tz_lsms_panel_2010)baseline_quad_inter_soils_2012 <-lm(yld_~N_kgha*soc_5_15cm+I(N_kgha^2)+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc+arain_tot+population_density+wc2.1_30s_elev+sand_0_5cm+nitrogen_0_5cm+soc_5_15cm+ ECN_5_15cm+pH_0_5cm,data=tz_lsms_panel_2012)baseline_quad_inter_soils_2014 <-lm(yld_~N_kgha*soc_5_15cm+I(N_kgha^2)+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc+arain_tot+population_density+wc2.1_30s_elev+sand_0_5cm+nitrogen_0_5cm+soc_5_15cm+ ECN_5_15cm+pH_0_5cm,data=tz_lsms_panel_2014)baseline_quad_inter_soils_2020 <-lm(yld_~N_kgha*soc_5_15cm+I(N_kgha^2)+P_kgha+I(P_kgha^2)+plotha+ncrops+hhmem+femhead+headeduc+arain_tot+population_density+wc2.1_30s_elev+sand_0_5cm+nitrogen_0_5cm+soc_5_15cm+ ECN_5_15cm+pH_0_5cm,data=tz_lsms_panel_2020)visreg2d(baseline_quad_inter_soils_2008, "N_kgha", "soc_5_15cm",ylab="Soil carbon (g per kg)",xlab="Nitrogen (kg per ha)", main="2008")visreg2d(baseline_quad_inter_soils_2010, "N_kgha", "soc_5_15cm",ylab="Soil carbon (g per kg)",xlab="Nitrogen (kg per ha)", main="2010")visreg2d(baseline_quad_inter_soils_2012, "N_kgha", "soc_5_15cm",ylab="Soil carbon (g per kg)",xlab="Nitrogen (kg per ha)", main="2012")visreg2d(baseline_quad_inter_soils_2014, "N_kgha", "soc_5_15cm",ylab="Soil carbon (g per kg)",xlab="Nitrogen (kg per ha)", main="2014")visreg2d(baseline_quad_inter_soils_2020, "N_kgha", "soc_5_15cm",ylab="Soil carbon (g per kg)",xlab="Nitrogen (kg per ha)", main="2020")# Quadratic plateau or linear plateaulibrary(easynls)tz_lsms_panel.temp <- tz_lsms_panel[, c("yld_","N_kgha")]nls.QP <-nlsfit(tz_lsms_panel.temp, model =4)nls.QP$Modelmn=nls.QP$Parametersmnsummary(mn)```### Site-year specific Quadratic Only response functionsThe site-year specific Quadratic response function can be modeled as At level 1, we have $$ Y_{i} = a_{i} + b_{i}*N + c_{i}*N^2 + \varepsilon_{i} $$ At level 2, we have $$ a_{i} \sim N(a_0, \sigma_{a}^2) \\b_{i} \sim N(b_0, \sigma_{b}^2) \\c_{i} \sim N(c_0, \sigma_{c}^2) \\$$ This model can be estimated with the linear mixed model function `lmer` in R package `lme4````{r}tz_lsms_panel.temp <- tz_lsms_panel[, .(yld_, N_kgha, N2 = N_kgha^2, year)]lmer.Q <-lmer(yld_ ~1+ N_kgha + N2 + (1| year) + (0+ N_kgha | year) + (0+ N2 | year), data = tz_lsms_panel.temp)lmer.Qsummary(lmer.Q)```Or, althernatively, can be estimated with the non-linear mixed model function `nlme````{r}library(nlme)nlme.Q <-nlme(yld_ ~ (a + b * N_kgha + c * (N_kgha^2)),data = tz_lsms_panel,fixed = a + b + c ~1,random = a + b + c ~1,groups =~ year,start =c(800, 10, -0.001))nlme.Q```### Site-year specific Quadratic-plus-plateau response functionsThe site-year specific response function can be modeled using a hierarchical quadratic-plus-plateau model. At level 1, we have $$ Y_{i} = \min(a_{i} + b_{i}*N + c_{i}*N^2, Y_{max}) + \varepsilon_{i} $$ At level 2, we have $$ a_{i} \sim N(a_0, \sigma_{a}^2) \\b_{i} \sim N(b_0, \sigma_{b}^2) \\c_{i} \sim N(c_0, \sigma_{c}^2) \\Y_{max} = a_{i} - b_{i}^2/(4*c_i)$$It seems the R function `nlme` would work to estimate this model```{r}# Define quadratic-plus-plateau functionmq <-lm(yld_ ~ N_kgha +I(N_kgha^2), data=tz_lsms_panel)a0 <-coef(mq)[[1]]b0 <-coef(mq)[[2]]c0 <-coef(mq)[[3]]clx0 <--0.5*b0/c0# Test nls# fx.QP <- function(N, a, b, c) {# y <- (a + b * N + c * I(N^2)) * (N <= -0.5 * b/c) + (a + I(-b^2/(4 * c))) * (N > -0.5 * b/c)# return(y)# }# # nls.QP <- nls(Y ~ fx.QP(N, a, b, c),# start = list(a = a0, b = b0, c = c0), data = dat.Puntel.CC.mean,# control = nls.control(maxiter = 6000))# quadplat <- function(x, a, b, clx) {# ifelse(x < clx, a + b * x + (-0.5*b/clx) * x * x, # a + b * clx + (-0.5*b/clx) * clx * clx)# }# # nls.QP <- nls(Y ~ quadplat(N, a, b, clx), # start = list(a = a0, b = b0, clx = clx0), data = dat.Puntel.CC.mean, # control = nls.control(maxiter = 6000))# nlme.QP <- nlme(Y ~ fx.QP(N, a, b, c),# data = dat.Puntel.CC.mean,# fixed = a + b + c ~ 1,# random = a + b + c ~ 1,# groups = ~ year,# start = c(a0, b0, c0))# # nlme.QP# tz_lsms_panel.nlme <- nlme(yld_ ~ (a + b * N_kgha + c * I(N_kgha ^2)) * (N_kgha <= (-0.5 * b/c)) + (a- I(b^2/(4 * c))) * (N_kgha > (-0.5 * b/c)),# data = tz_lsms_panel,# fixed = a + b + c ~ 1,# random = a + b + c ~ 1,# groups = ~ year,# start = c(a0, b0, c0))# tz_lsms_panel.nlme```### Bayesian analysis```{r}# library(brms)# # tz_lsms_panel$Nsq_kgha=tz_lsms_panel$N_kgha^2# tz_lsms_panel$Y=tz_lsms_panel$yld_# # f1 <- Y ~ (a+ b*N_kgha+c*(Nsq_kgha))*(N_kgha<=(-0.5*b/c))+(a-(b*b/(4*c)))*(N_kgha>(-0.5*b/c))# # prior_1=c(set_prior("normal(5,1)", nlpar="a"),# set_prior("normal(0,1)", nlpar="b"),# set_prior("normal(0,1)", nlpar="c"))# # form=bf(f1,nl=TRUE)+list(a~1|year,b~1|year,c~1|year)# # bayesQP=brm(form,prior=prior_1,data=tz_lsms_panel)# # summary(bayesQP) ```## Nonlinear parameter models: geoadditive model```{r}set.seed(321)library(bamlss)# Linearf1 <- yld_~N_kgha + plotha + P_kgha + ncrops + hhmem + femhead + headeduc + arain_tot +population_density+wc2.1_30s_elev+sand_0_5cm+nitrogen_0_5cm+soc_5_15cm+ ECN_5_15cm+pH_0_5cm+ region + yearb1 <-bamlss(f1, data = tz_lsms_panel, family ="gaussian", n.iter =12000, burnin =2000, thin =10)b1 <-gam(f1, data = tz_lsms_panel, family ="gaussian", n.iter =12000, burnin =2000, thin =10)summary(b1)library(visreg)visreg(b1,"N_kgha")# Nonlinear# f2=yld_~s(N_kgha)+s(plotha)+s(P_kgha)+s(ncrops)+s(hhmem)+femhead+headeduc+s(arain_tot)+ti(region)+ti(year)## b2=bamlss(f2,data=tz_lsms_panel,family="gaussian", n.iter=12000, burnin=2000, thin=10)### summary(b2)# plot(b2, ask=FALSE)# #library(bamlss)f3 <-list( yld_ ~s(N_kgha, by = year) + plotha + P_kgha + ncrops + hhmem + femhead + headeduc + arain_tot +population_density+wc2.1_30s_elev+sand_0_5cm+nitrogen_0_5cm+soc_5_15cm+ ECN_5_15cm+pH_0_5cm, sigma ~s(N_kgha, by = year) + plotha + P_kgha + ncrops + hhmem + femhead + headeduc + arain_tot+population_density+wc2.1_30s_elev+sand_0_5cm+nitrogen_0_5cm+soc_5_15cm+ ECN_5_15cm+pH_0_5cm)b3 <-bamlss(f3, data = tz_lsms_panel)summary(b3)# plot(b3,ask=FALSE)plot(b3, cex =2.5, model ="mu", term ="s(N_kgha,by=year)")plot(b3, cex =2.5, model ="sigma", term ="s(N_kgha,by=year")# GAM modelf_gam <- yld_ ~s(N_kgha, by = year,sp=0.1) + plotha + P_kgha + ncrops + hhmem + femhead + headeduc + arain_tot +population_density+wc2.1_30s_elev+sand_0_5cm+nitrogen_0_5cm+soc_5_15cm+ ECN_5_15cm+pH_0_5cmb_gam <-gam(f_gam,data = tz_lsms_panel)previous_theme <-theme_set(theme_bw())visreg_polynomial_plot=visreg(b_gam,"N_kgha","year",gg=TRUE,xlab="Applied nitrogen (kg per ha)", ylab="Maize yield (kg per ha)",partial=FALSE,rug=FALSE)visreg_polynomial_plotggsave("figures/visreg_polynomial_plot.png",width=4.16,height=3.16)dev.off()visreg2d(b_gam,"year", "N_kgha", plot.type="image")```# Causal RF approach## Binary treatment```{r}library(grf)library(policytree)tz_lsms_panel_estim_fert <-subset(tz_lsms_panel, select =c("fert1_bin","yld_", "N_kgha", "plotha", "P_kgha", "ncrops", "hhmem", "femhead", "headeduc", "arain_tot", "population_density", "wc2.1_30s_elev", "sand_0_5cm", "nitrogen_0_5cm", "soc_5_15cm", "ECN_5_15cm", "pH_0_5cm", "region", "year", "lat_modified", "lon_modified"))tz_lsms_panel_estim_fert$region <-as.numeric(tz_lsms_panel_estim_fert$region)tz_lsms_panel_estim_fert$year <-as.numeric(tz_lsms_panel_estim_fert$year)library(tidyr)tz_lsms_panel_estim_fert <- tz_lsms_panel_estim_fert %>%drop_na()Y_cf_fert <-as.vector(tz_lsms_panel_estim_fert$yld_)## Causal random forest -----------------X_cf_fert <-subset(tz_lsms_panel_estim_fert, select =c("plotha", "ncrops", "hhmem", "femhead", "headeduc", "arain_tot", "population_density", "wc2.1_30s_elev", "sand_0_5cm", "nitrogen_0_5cm", "soc_5_15cm", "ECN_5_15cm", "pH_0_5cm", "region", "year"))X_firststage_cf_fert <-subset(tz_lsms_panel_estim_fert, select =c("plotha", "ncrops", "hhmem", "femhead", "headeduc", "arain_tot", "population_density", "wc2.1_30s_elev", "sand_0_5cm", "nitrogen_0_5cm", "soc_5_15cm", "ECN_5_15cm", "pH_0_5cm", "region", "year"))W_cf_fert_binary <-as.vector(tz_lsms_panel_estim_fert$fert1_bin)# Probability random forest to create weightsW.multi_fert.forest_binary <-regression_forest(X_cf_fert, W_cf_fert_binary,equalize.cluster.weights =FALSE,seed =2)W.hat.multi.all_fert_binary <-predict(W.multi_fert.forest_binary, estimate.variance =TRUE)$predictions# Regression forest to get expected responsesY.multi_fert.forest_binary <-regression_forest(X_cf_fert, Y_cf_fert,equalize.cluster.weights =FALSE,seed =2)print(Y.multi_fert.forest_binary)varimp.multi_fert_binary <-variable_importance(Y.multi_fert.forest_binary)Y.hat.multi.all_fert_binary <-predict(Y.multi_fert.forest_binary, estimate.variance =TRUE)$predictions# Fit binary causal RF modelmulti_fert.forest_binary <-causal_forest(X = X_cf_fert, Y = Y_cf_fert, W = W_cf_fert_binary, W.hat = W.hat.multi.all_fert_binary, Y.hat = Y.hat.multi.all_fert_binary, seed =2)varimp.multi_fert_cf_binary <-variable_importance(multi_fert.forest_binary)# Average treatment effectsmulti_fert_ate_binary <-average_treatment_effect(multi_fert.forest_binary, target.sample ="overlap")multi_fert_ate_binarymulti_fert_binary_calibration <-test_calibration(multi_fert.forest_binary)multi_fert_binary_calibration```## Continuous treatment```{r}library(grf)library(policytree)tz_lsms_panel_estim_fert <-subset(tz_lsms_panel, select =c("fert1_bin","yld_", "N_kgha", "plotha", "P_kgha", "ncrops", "hhmem", "femhead", "headeduc", "arain_tot", "population_density", "wc2.1_30s_elev", "sand_0_5cm", "nitrogen_0_5cm", "soc_5_15cm", "ECN_5_15cm", "pH_0_5cm", "region", "year", "lat_modified", "lon_modified","maipkg_usd","Npkg_usd"))tz_lsms_panel_estim_fert$region <-as.numeric(tz_lsms_panel_estim_fert$region)tz_lsms_panel_estim_fert$year <-as.numeric(tz_lsms_panel_estim_fert$year)library(tidyr)tz_lsms_panel_estim_fert <- tz_lsms_panel_estim_fert %>%drop_na()Y_cf_fert <-as.vector(tz_lsms_panel_estim_fert$yld_)## Causal random forest -----------------X_cf_fert <-subset(tz_lsms_panel_estim_fert, select =c("plotha", "ncrops", "hhmem", "femhead", "headeduc", "arain_tot", "population_density", "wc2.1_30s_elev", "sand_0_5cm", "nitrogen_0_5cm", "soc_5_15cm", "ECN_5_15cm", "pH_0_5cm", "region", "year"))X_firststage_cf_fert <-subset(tz_lsms_panel_estim_fert, select =c("plotha", "ncrops", "hhmem", "femhead", "headeduc", "arain_tot", "population_density", "wc2.1_30s_elev", "sand_0_5cm", "nitrogen_0_5cm", "soc_5_15cm", "ECN_5_15cm", "pH_0_5cm", "region", "year"))W_cf_fert_continuous <-as.vector(tz_lsms_panel_estim_fert$N_kgha)# Probability random forest to create weightsW.multi_fert.forest_continuous <-regression_forest(X_cf_fert, W_cf_fert_continuous,equalize.cluster.weights =FALSE,seed =2)W.hat.multi.all_fert_continuous <-predict(W.multi_fert.forest_continuous, estimate.variance =TRUE)$predictions# Regression forest to get expected responsesY.multi_fert.forest_continuous <-regression_forest(X_cf_fert, Y_cf_fert,equalize.cluster.weights =FALSE,seed =2)print(Y.multi_fert.forest_continuous)varimp.multi_fert_continuous <-variable_importance(Y.multi_fert.forest_continuous)Y.hat.multi.all_fert_continuous <-predict(Y.multi_fert.forest_continuous, estimate.variance =TRUE)$predictions# Fit continuous causal RF modelmulti_fert.forest_continuous <-causal_forest(X = X_cf_fert, Y = Y_cf_fert, W = W_cf_fert_continuous, W.hat = W.hat.multi.all_fert_continuous, Y.hat = Y.hat.multi.all_fert_continuous, seed =2)varimp.multi_fert_cf_continuous <-variable_importance(multi_fert.forest_continuous)# Average treatment effectsmulti_fert_ate_continuous <-average_treatment_effect(multi_fert.forest_continuous, target.sample ="overlap")multi_fert_ate_continuousmulti_fert_continuous_calibration <-test_calibration(multi_fert.forest_continuous)multi_fert_continuous_calibrationlibrary(ggridges)library(dplyr)library(ggplot2)tau.multi_fert.forest_continuous <-predict(multi_fert.forest_continuous, target.sample ="all", estimate.variance =TRUE)tau.multi_fert.forest_continuous <-as.data.frame(tau.multi_fert.forest_continuous)tau.multi_fert.forest_X <-data.frame(tz_lsms_panel_estim_fert, tau.multi_fert.forest_continuous)NUE_Density=ggplot(tau.multi_fert.forest_X, aes(x = predictions, y ="", fill =factor(stat(quantile)))) +stat_density_ridges(geom ="density_ridges_gradient", calc_ecdf =TRUE,quantiles =4, quantile_lines =TRUE ) +scale_y_discrete(expand =c(0.01, 0)) +scale_fill_viridis_d(name ="Quartiles") +expand_limits(y =1) +theme_bw(base_size =16) +labs(x ="N use effect", y ="Density")NUE_Densityggsave("figures/NUE_Density.png", dpi=300, width=5.16,height=4.16)```# Exploring heterogeneity in Conditional N Use Efficiencies from CRF## Causal RF```{r}library(ggplot2)NperHa_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$N_kgha >0), aes(N_kgha, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Applied N per ha", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))NperHa_CATE_N_plotPlotsize_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$plotha >0), aes(plotha, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Plot size (ha)", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))Plotsize_CATE_N_plotP_CATE_N_plot <-ggplot(tau.multi_fert.forest_X, aes(P_kgha, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="P", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))P_CATE_N_plotHhmem_CATE_N_plot <-ggplot(tau.multi_fert.forest_X, aes(hhmem, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="HHMEM", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))Hhmem_CATE_N_plot# By yeartau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==1]=2008tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==2]=2010tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==3]=2012tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==4]=2014tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==5]=2020tau.multi_fert.forest_X$year_det=as.numeric(tau.multi_fert.forest_X$year_det)NperHa_CATE_N_plot_yr <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$N_kgha >0), aes(N_kgha, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +lims(x =c(0, 100)) +labs(x ="Applied N per ha", y ="N use efficiency") +theme_bw(base_size =16) +facet_wrap(~year_det)NperHa_CATE_N_plot_yr# By soil organic carbonlibrary(ggplot2)soc_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$soc_5_15cm >0), aes(soc_5_15cm, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Soil organic carbon (g per kg)", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))soc_CATE_N_plotggsave("figures/soc_CATE_N_plot.png", dpi=300, width=4.16,height=4.16)# By soil sandlibrary(ggplot2)sand_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$sand_0_5cm >0), aes(sand_0_5cm, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Sand (%)", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))sand_CATE_N_plotggsave("figures/sand_CATE_N_plot.png", dpi=300, width=4.16,height=4.16)# Electrical conductivitylibrary(ggplot2)soil_elec_cond_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$ECN_5_15cm >0), aes(ECN_5_15cm, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Soil electrical conductivity", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))soil_elec_cond_CATE_N_plotggsave("figures/soil_elec_cond_CATE_N_plot.png", dpi=300, width=4.16,height=4.16)# pHlibrary(ggplot2)soil_pH_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$pH_0_5cm >0), aes(pH_0_5cm, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Soil pH", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))soil_pH_CATE_N_plotggsave("figures/soil_pH_CATE_N_plot.png", dpi=300, width=4.16,height=4.16)# soil nitrogenlibrary(ggplot2)soil_nitrogen_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$nitrogen_0_5cm >0), aes(nitrogen_0_5cm, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Soil nitrogen (g per kg)", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))soil_nitrogen_CATE_N_plotggsave("figures/soil_nitrogen_CATE_N_plot.png", dpi=300, width=4.16,height=4.16)# By population densitylibrary(ggplot2)pop_density_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$population_density >0), aes(population_density, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Population density", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))pop_density_CATE_N_plotggsave("figures/pop_density_CATE_N_plot.png", dpi=300, width=4.16,height=4.16)# By elevationlibrary(ggplot2)elev_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$wc2.1_30s_elev >0), aes(wc2.1_30s_elev, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Elevation (masl)", y ="N use efficiency")previous_theme <-theme_set(theme_bw(base_size =16))elev_CATE_N_plotggsave("figures/elev_CATE_N_plot.png", dpi=300, width=4.16,height=4.16)# By rainfalllibrary(ggplot2)rainfall_CATE_N_plot <-ggplot(subset(tau.multi_fert.forest_X, tau.multi_fert.forest_X$arain_tot >0), aes(arain_tot, predictions)) +geom_smooth(method ="loess", formula = y ~ x, col ="darkblue") +labs(x ="Rainfall (mm)", y ="N use efficiency")+facet_wrap(~year_det)previous_theme <-theme_set(theme_bw(base_size =16))rainfall_CATE_N_plot```# Are N use efficiencies falling overtime?## Causal RF```{r}tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==1]=2008tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==2]=2010tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==3]=2012tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==4]=2014tau.multi_fert.forest_X$year_det[tau.multi_fert.forest_X$year==5]=2020tau.multi_fert.forest_X$year_det=as.numeric(tau.multi_fert.forest_X$year_det)Year_CATE_N_plot=ggplot(tau.multi_fert.forest_X,aes(year_det,predictions))+geom_smooth(method="loess",formula=y~x,col="darkblue")+labs(x="Year",y="N use efficiency")+scale_x_continuous(breaks =c(2008, 2010, 2012,2014,2020))previous_theme <-theme_set(theme_bw())Year_CATE_N_plotggsave("figures/Year_CATE_N_plot.png", dpi=300, width=5.16,height=4.16)library(ggpubr) library(tidyverse) #Sowing dates Year_CATE_N_Errorplot= tau.multi_fert.forest_X %>%drop_na(year_det) %>%ggerrorplot(x ="year_det", y ="predictions",add ="mean", error.plot ="errorbar", color="black",size=1, ggtheme=theme_bw())+labs(x="Year",y="N use efficiency")+theme_bw(base_size =16)Year_CATE_N_Errorplotggsave("figures/Year_CATE_N_Errorplot.png", dpi=300, width=5.16,height=4.16)```# Double machine learning (DML)```{r}```# Mapping the estimates```{r}library(ggridges)library(dplyr)library(ggplot2)tau.multi_fert.forest_dummy <-predict(multi_fert.forest_binary, target.sample ="all", estimate.variance =TRUE)tau.multi_fert.forest_dummy <-as.data.frame(tau.multi_fert.forest_dummy)tau.multi_fert.forest_X_dummy <-data.frame(tz_lsms_panel_estim_fert, tau.multi_fert.forest_dummy)ggplot(tau.multi_fert.forest_X_dummy, aes(x = predictions, y ="", fill =factor(stat(quantile)))) +stat_density_ridges(geom ="density_ridges_gradient", calc_ecdf =TRUE,quantiles =4, quantile_lines =TRUE ) +scale_y_discrete(expand =c(0.01, 0)) +scale_fill_viridis_d(name ="Quartiles") +expand_limits(y =1) +theme_bw(base_size =16) +labs(x ="Inorgatic fert use effect", y ="Density")#library(sp)library(sf)tau.multi_fert.forest_X_dummy_sp <-SpatialPointsDataFrame(cbind(tau.multi_fert.forest_X_dummy$lon_modified, tau.multi_fert.forest_X_dummy$lat_modified), data = tau.multi_fert.forest_X_dummy, proj4string =CRS("+proj=longlat +datum=WGS84"))library(tmap)tmap_mode("plot")Nuse_effect_map <-tm_shape(tau.multi_fert.forest_X_dummy_sp) +tm_dots(col ="predictions", title ="Effect of N use on yield (kg/ha)", style ="quantile") +tm_layout(legend.outside =TRUE)Nuse_effect_maptmap_save(Nuse_effect_map, "figures/Nuse_effect_map.png", width =600, height =600, asp =0)# N Use efficiency maplibrary(sp)library(sf)tau.multi_fert.forest_X_sp <-SpatialPointsDataFrame(cbind(tau.multi_fert.forest_X$lon_modified, tau.multi_fert.forest_X$lat_modified), data = tau.multi_fert.forest_X, proj4string =CRS("+proj=longlat +datum=WGS84"))library(tmap)tmap_mode("plot")Nuse_efficiency_map <-tm_shape(tau.multi_fert.forest_X_sp) +tm_dots(col ="predictions", title ="NUE (kg Maize/Kg N)", style ="quantile") +tm_layout(legend.outside =TRUE)Nuse_efficiency_maptmap_save(Nuse_efficiency_map, "figures/Nuse_efficiency_map.png", width =600, height =600, asp =0)# N Use maplibrary(sp)library(sf)tau.multi_fert.forest_X_sp_small <-subset(tau.multi_fert.forest_X_sp, tau.multi_fert.forest_X_sp$N_kgha >0)library(tmap)tmap_mode("plot")Nuse_map <-tm_shape(tau.multi_fert.forest_X_sp_small) +tm_dots(col ="N_kgha", title ="N applied (Kg/ha)", style ="quantile") +tm_layout(legend.outside =TRUE)Nuse_maptmap_save(Nuse_map, "figures/Nuse_map.png", width =600, height =600, asp =0)```## Map of N Use Efficiencies overtime```{r}library(tmap)tmap_mode("plot")Nuse_efficiency_map_yr <-tm_shape(tau.multi_fert.forest_X_sp) +tm_dots(col ="predictions", title ="NUE (kg Maize/Kg N)", style ="quantile") +tm_layout(legend.outside =TRUE) +tm_facets("year")Nuse_efficiency_map_yrtmap_save(Nuse_efficiency_map_yr, "figures/Nuse_efficiency_map_yr.png", width =600, height =600, asp =0)```# Mapping Descriptive Statistics```{r}library(geodata)TZ <-gadm(country ="TZ", level =1, path =tempdir())plot(TZ)TZ_sf=st_as_sf(TZ)TZ_sp=as_Spatial(TZ_sf)library(tmap)tm_shape(TZ_sp) +tm_polygons(col="grey",legend.show = F,border.col="white")+tmap_options(max.categories =31)+tm_text("NAME_1", size =0.7)tau.multi_fert.forest_X_sp$region_name <-over(tau.multi_fert.forest_X_sp, TZ_sp[,"NAME_1"])# Summary by region and by yearlibrary(modelsummary)tau.multi_fert.forest_X_sp_dt <-tau.multi_fert.forest_X_sp@datatau.multi_fert.forest_X_sp_dt$region_name_final=tau.multi_fert.forest_X_sp_dt$region_name$NAME_1NUE_cont <-datasummary(Heading("region_name")*region_name_final~Heading("N_obs") * N +Heading("percent") *Percent()+ predictions*(Mean+SD),data = tau.multi_fert.forest_X_sp_dt, output="data.frame")NUE_cont$N_obs=as.numeric(NUE_cont$N_obs)NUE_cont$Mean=as.numeric(NUE_cont$Mean)NUE_cont$SD=as.numeric(NUE_cont$SD)library(reactable)library(htmltools)library(fontawesome)htmltools::browsable(tagList( tags$button(tagList(fontawesome::fa("download"), "Download as CSV"),onclick ="Reactable.downloadDataCSV('NUE_cont', 'NUE_cont.csv')" ),reactable( NUE_cont,searchable =TRUE,defaultPageSize =38,elementId ="NUE_cont" ) ))# Remove regions with fewer observations than 50NUE_cont=subset(NUE_cont,NUE_cont$N_obs>50)NUE_cont_sf=merge(TZ_sf,NUE_cont, by.x="NAME_1",by.y="region_name")# Map the library(tmap)tmap_mode("plot")tm_shape(NUE_cont_sf) +tm_polygons(col ="Mean", title ="Average NUE", style ="quantile") +tm_layout(legend.outside =TRUE)```# Mapping NUEs by year and region```{r}mean_na <-function(x) mean(x, na.rm =TRUE)sd_na <-function(x) SD(x, na.rm =TRUE)p25_na <-function(x) quantile(x, probs =0.25, na.rm =TRUE)median_na <-function(x) median(x, na.rm =TRUE)p75_na <-function(x) quantile(x, probs =0.75, na.rm =TRUE)NUE_cont_region_year <-datasummary(Factor(year)+Factor(region_name_final)~Heading("N obs") * N +Heading("%") *Percent() + (predictions) * (mean_na + sd_na), data = tau.multi_fert.forest_X_sp_dt, output ="data.frame")library(reactable)library(htmltools)library(fontawesome)htmltools::browsable(tagList( tags$button(tagList(fontawesome::fa("download"), "Download as CSV"),onclick ="Reactable.downloadDataCSV('NUE_cont_region_year', 'NUE_cont_region_year.csv')" ),reactable( NUE_cont_region_year,searchable =TRUE,defaultPageSize =38,elementId ="NUE_cont_region_year" ) ))# Mixlibrary(data.table)tau.multi_fert.forest_X_sp_dt <-data.table (tau.multi_fert.forest_X_sp_dt)tau.multi_fert.forest_X_sp_dt$One=1NUE_cont_region_year_mix <- tau.multi_fert.forest_X_sp_dt[,.(Mean=base::mean(predictions, na.rm=TRUE),SD=sd(predictions, na.rm =TRUE),N_obs=sum(One, na.rm =TRUE),Maize_price_USD=base::mean(maipkg_usd, na.rm=TRUE),N_price_USD=base::mean(Npkg_usd, na.rm=TRUE),fert1_bin= base::mean(Npkg_usd, na.rm=TRUE),yld = base::mean(yld_, na.rm=TRUE),N_kgha = base::mean(N_kgha, na.rm=TRUE) ), by =c("year","region_name_final")]NUE_cont_region_year_mix$year_det[NUE_cont_region_year_mix$year==1]=2008NUE_cont_region_year_mix$year_det[NUE_cont_region_year_mix$year==2]=2010NUE_cont_region_year_mix$year_det[NUE_cont_region_year_mix$year==3]=2012NUE_cont_region_year_mix$year_det[NUE_cont_region_year_mix$year==4]=2014NUE_cont_region_year_mix$year_det[NUE_cont_region_year_mix$year==5]=2020library(reactable)library(htmltools)library(fontawesome)htmltools::browsable(tagList( tags$button(tagList(fontawesome::fa("download"), "Download as CSV"),onclick ="Reactable.downloadDataCSV('NUE_cont_region_year_mix', 'NUE_cont_region_year_mix.csv')" ),reactable( NUE_cont_region_year_mix,searchable =TRUE,defaultPageSize =38,elementId ="NUE_cont_region_year_mix" ) ))NUE_cont_region_year_mix=subset(NUE_cont_region_year_mix,NUE_cont_region_year_mix$N_obs>10)NUE_cont_region_year_mix_sf=merge(TZ_sf,NUE_cont_region_year_mix, by.x="NAME_1",by.y="region_name_final")library(tmap)tmap_mode("plot")NUE_year_region <-tm_shape(NUE_cont_region_year_mix_sf) +tm_polygons(col ="Mean", title ="NUE (kg Maize/Kg N)", style ="quantile") +tm_layout(legend.outside =TRUE) +tm_facets("year_det")NUE_year_regiontmap_save(NUE_year_region, "figures/NUE_year_region.png", width =600, height =600, asp =0)library(tmap)tmap_mode("plot")Yield_year_region <-tm_shape(NUE_cont_region_year_mix_sf) +tm_polygons(col ="yld", title ="Maize yield (kg/ha)", style ="quantile") +tm_layout(legend.outside =TRUE) +tm_facets("year_det")Yield_year_regiontmap_save(Yield_year_region, "figures/Yield_year_region.png", width =600, height =600, asp =0)```# Mapping partial profits by year and region```{r}library(rio)Maize_Prices_Tanzania_2021_dt_region <-import("admin_data/Maize_Prices_Tanzania_2021_dt_region.csv")Maize_Prices_Tanzania_2021_dt_region <-subset(Maize_Prices_Tanzania_2021_dt_region,select=c("Region","Maize_max_price_per_kg","Maize_min_price_per_kg"))Urea_Prices_Tanzania_sp_dt_region <-import ("admin_data/Urea_Prices_Tanzania_sp_dt_region.csv")Urea_Prices_Tanzania_sp_dt_region <-subset(Urea_Prices_Tanzania_sp_dt_region,select=c("region_name_final","mean_N_price","SD_N_price"))urea_maize_prices_region <-merge(Maize_Prices_Tanzania_2021_dt_region,Urea_Prices_Tanzania_sp_dt_region,by.x="Region",by.y="region_name_final")NUE_cont_region_year_mix_sf_prices=merge(NUE_cont_region_year_mix_sf,urea_maize_prices_region,by.x="NAME_1",by.y="Region")NUE_cont_region_year_mix_sf_prices$maize_rev_from_N = NUE_cont_region_year_mix_sf_prices$Mean*NUE_cont_region_year_mix_sf_prices$Maize_max_price_per_kgNUE_cont_region_year_mix_sf_prices$Survey_USD_maize_rev_from_N = NUE_cont_region_year_mix_sf_prices$Mean*NUE_cont_region_year_mix_sf_prices$Maize_price_USDNUE_cont_region_year_mix_sf_prices$partial_profit= NUE_cont_region_year_mix_sf_prices$maize_rev_from_N-NUE_cont_region_year_mix_sf_prices$mean_N_priceNUE_cont_region_year_mix_sf_prices$Survey_USD_partial_profit= NUE_cont_region_year_mix_sf_prices$Survey_USD_maize_rev_from_N-NUE_cont_region_year_mix_sf_prices$N_price_USDNUE_cont_region_year_mix_sf_prices$vcr=NUE_cont_region_year_mix_sf_prices$maize_rev_from_N/NUE_cont_region_year_mix_sf_prices$mean_N_priceNUE_cont_region_year_mix_sf_prices$Survey_USD_vcr=NUE_cont_region_year_mix_sf_prices$Survey_USD_maize_rev_from_N/NUE_cont_region_year_mix_sf_prices$N_price_USDlibrary(sf)NUE_cont_region_year_mix_sf_prices_sp <-as_Spatial(NUE_cont_region_year_mix_sf_prices)NUE_cont_region_year_mix_sf_prices_sp_dt <- NUE_cont_region_year_mix_sf_prices_sp@dataNUE_cont_region_year_mix_sf_prices_sp_dt <-as.data.frame(NUE_cont_region_year_mix_sf_prices_sp_dt)# Export the tablelibrary(reactable)library(htmltools)library(fontawesome)htmltools::browsable(tagList( tags$button(tagList(fontawesome::fa("download"), "Download as CSV"),onclick ="Reactable.downloadDataCSV('NUE_cont_region_year_mix_sf_prices_sp_dt', 'NUE_cont_region_year_mix_sf_prices_sp_dt.csv')" ),reactable( NUE_cont_region_year_mix_sf_prices_sp_dt,searchable =TRUE,defaultPageSize =38,elementId ="NUE_cont_region_year_mix_sf_prices_sp_dt" ) ))# N price library(tmap)tmap_mode("plot")N_price_year_region <-tm_shape(NUE_cont_region_year_mix_sf_prices) +tm_polygons(col ="mean_N_price", title ="Nitrogen price (TZS/kg)", style ="quantile") +tm_layout(legend.outside =TRUE) N_price_year_regiontmap_save(N_price_year_region, "figures/N_price_year_region.png", width =600, height =600)library(tmap)tmap_mode("plot")Survey_USD_N_price_year_region <-tm_shape(NUE_cont_region_year_mix_sf_prices) +tm_polygons(col ="N_price_USD", title ="Nitrogen price (USD/kg)", style ="quantile") +tm_layout(legend.outside =TRUE) Survey_USD_N_price_year_regiontmap_save(Survey_USD_N_price_year_region, "figures/Survey_USD_N_price_year_region.png", width =600, height =600)# Maize output price library(tmap)tmap_mode("plot")maize_price_year_region <-tm_shape(NUE_cont_region_year_mix_sf_prices) +tm_polygons(col ="Maize_max_price_per_kg", title ="2021 maize prices (TZS/kg)", style ="quantile") +tm_layout(legend.outside =TRUE) maize_price_year_regiontmap_save(maize_price_year_region, "figures/maize_price_year_region.png", width =600, height =600)library(tmap)tmap_mode("plot")Survey_USD_maize_price_year_region <-tm_shape(NUE_cont_region_year_mix_sf_prices) +tm_polygons(col ="Maize_price_USD", title ="Maize prices (USD/kg)", style ="quantile") +tm_layout(legend.outside =TRUE) Survey_USD_maize_price_year_regiontmap_save(Survey_USD_maize_price_year_region, "figures/Survey_USD_maize_price_year_region.png", width =600, height =600)# Value cost ratiolibrary(tmap)tmap_mode("plot")vcr_year_region <-tm_shape(NUE_cont_region_year_mix_sf_prices) +tm_polygons(col ="vcr", title ="Maize value-cost ratio (VCR)", style ="quantile") +tm_layout(legend.outside =TRUE) +tm_facets("year_det")vcr_year_regiontmap_save(vcr_year_region, "figures/vcr_year_region.png", width =600, height =600)library(tmap)tmap_mode("plot")Survey_USD_vcr_year_region <-tm_shape(NUE_cont_region_year_mix_sf_prices) +tm_polygons(col ="Survey_USD_vcr", title ="Maize value-cost ratio (VCR)", style ="quantile") +tm_layout(legend.outside =TRUE) +tm_facets("year_det")Survey_USD_vcr_year_regiontmap_save(Survey_USD_vcr_year_region, "figures/Survey_USD_vcr_year_region.png", width =600, height =600)# Partial profit to nitrogen applicationlibrary(tmap)tmap_mode("plot")partial_prof_year_region <-tm_shape(NUE_cont_region_year_mix_sf_prices) +tm_polygons(col ="partial_profit", title ="Partial profit (TZS /Kg N)", style ="quantile") +tm_layout(legend.outside =TRUE) +tm_facets("year_det")partial_prof_year_regiontmap_save(partial_prof_year_region, "figures/partial_prof_year_region.png", width =600, height =600)library(tmap)tmap_mode("plot")Survey_USD_partial_prof_year_region <-tm_shape(NUE_cont_region_year_mix_sf_prices) +tm_polygons(col ="Survey_USD_partial_profit", title ="Partial profit (USD /Kg N)", style ="quantile") +tm_layout(legend.outside =TRUE) +tm_facets("year_det")Survey_USD_partial_prof_year_regiontmap_save(Survey_USD_partial_prof_year_region, "figures/Survey_USD_partial_prof_year_region.png", width =600, height =600)```# Spatially Gridded Profitability Analysis## Spatially gridded NUEs```{r}# Gridded yield gainslon <- tau.multi_fert.forest_X$lon_modifiedlon <-as.data.frame(lon)lat <- tau.multi_fert.forest_X$lat_modifiedlat <-as.data.frame(lat)tau.multi_fert.forest_X <-cbind(lon, lat, tau.multi_fert.forest_X)library(sp)tau.multi_fert.forest_X_sp_kr <-SpatialPointsDataFrame(cbind(tau.multi_fert.forest_X$lon, tau.multi_fert.forest_X$lat), data = tau.multi_fert.forest_X, proj4string =CRS("+proj=longlat +datum=WGS84"))# tau.multi_fert.forest_X_sp_kr_2008 =subset(tau.multi_fert.forest_X_sp_kr,tau.multi_fert.forest_X_sp_kr$year==1)# 2008 prediction library(bamlss)set.seed(111)f <- predictions ~s(lon,lat)## estimate model.b <-bamlss(f, data = tau.multi_fert.forest_X_sp_kr)plot(b)# Boundary maplibrary(sf)library(geodata)set.seed(111)TZ <-gadm(country ="TZ", level =0, path =tempdir())plot(TZ)TZ_sf=st_as_sf(TZ)TZ_sp=as_Spatial(TZ_sf)TZ_sf_dis <-st_union(TZ_sf)plot(TZ_sf_dis)# Predictelevationglobal_geodata <-elevation_global(2.5, path =tempdir())elevationglobal_geodata_aoi <- terra::crop(elevationglobal_geodata,TZ_sf_dis)library(raster)elevationglobal_geodata_aoi <-raster(elevationglobal_geodata_aoi)plot(elevationglobal_geodata_aoi)pred <-SpatialPoints(elevationglobal_geodata_aoi)@coordspred <-as.data.frame(pred)names(pred)[1:2] <-c("lon", "lat")pred <- pred %>%drop_na()library(sp)pred_sp <-SpatialPointsDataFrame(cbind(pred$lon, pred$lat), data = pred, proj4string =CRS("+proj=longlat +datum=WGS84"))yield_gain_hat <-predict(b, newdata = pred)yield_gain_hat <-as.data.frame(yield_gain_hat)pred_yield_gain_hat <-cbind(pred, yield_gain_hat )pred_yield_gain_hat$sigma <-NULLlibrary(terra)yield_gain_ras <-rast(pred_yield_gain_hat, type ="xyz")plot(yield_gain_ras )library(raster)yield_gain_ras2 <-raster(yield_gain_ras)library(sf)TZ_sf_dis_sp <-as_Spatial(TZ_sf_dis)yield_gain_ras2 <-mask(yield_gain_ras2, TZ_sf_dis_sp)plot(yield_gain_ras2, main ="Yield gains to N")library(rasterVis)levelplot(yield_gain_ras2, layers =1, margin =list(FUN ='median'), contour=TRUE)## library(tmap)yield_gain_raster_map=tm_shape(yield_gain_ras2) +tm_raster(title ="Nitrogen Use Efficiency",style ="cont", palette ="viridis")+tm_legend(position =c("right", "top"),legend.title.size=1.2,legend.text.size=1.2)yield_gain_raster_maptmap_save(yield_gain_raster_map,"figures/yield_gain_raster_map.png", dpi=300,width=4.88,height =4.16)```## VCR predictions```{r}# library(bamlss)# set.seed(111)# f <- vcr ~ s(lon,lat,year)# # ## estimate model.# b <- bamlss(f, data = tau.multi_fert.forest_X_sp_kr)# # plot(b)# # # Boundary map# library(sf)# library(geodata)# set.seed(111)# # TZ <- gadm(country = "TZ", level = 0, path = tempdir())# plot(TZ)# # TZ_sf= st_as_sf(TZ)# TZ_sp=as_Spatial(TZ_sf)# # TZ_sf_dis <- st_union(TZ_sf)# plot(TZ_sf_dis)# # # # Predict# elevationglobal_geodata <- elevation_global(2.5, path =tempdir())# # elevationglobal_geodata_aoi <- terra::crop(elevationglobal_geodata,TZ_sf_dis)# # library(raster)# elevationglobal_geodata_aoi <- raster(elevationglobal_geodata_aoi)# # plot(elevationglobal_geodata_aoi)# # pred <- SpatialPoints(elevationglobal_geodata_aoi)@coords# pred <- as.data.frame(pred)# # names(pred)[1:2] <- c("lon", "lat")# # pred <- pred %>% drop_na()# # library(sp)# pred_sp <-# SpatialPointsDataFrame(cbind(pred$lon, pred$lat), data = pred, proj4string = CRS("+proj=longlat +datum=WGS84"))# # # yield_gain_hat <- predict(b, newdata = pred)# # yield_gain_hat <- as.data.frame(yield_gain_hat)# # pred_yield_gain_hat <- cbind(pred, yield_gain_hat )# # pred_yield_gain_hat$sigma <- NULL# library(terra)# # yield_gain_ras <- rast(pred_yield_gain_hat, type = "xyz")# plot(yield_gain_ras )# # library(raster)# yield_gain_ras2 <- raster(yield_gain_ras)# # library(sf)# TZ_sf_dis_sp <- as_Spatial(TZ_sf_dis)# yield_gain_ras2 <- mask(yield_gain_ras2, TZ_sf_dis_sp)# plot(yield_gain_ras2, main = "Yield gains to N")# # library(rasterVis)# # levelplot(yield_gain_ras2, layers = 1, margin = list(FUN = 'median'), contour=TRUE)# # ## # library(tmap)# yield_gain_raster_map=tm_shape(yield_gain_ras2) +# tm_raster(title = "Nitrogen Use Efficiency",style = "cont", palette = "viridis")+# tm_legend(position = c("right", "top"),legend.title.size=1.2,legend.text.size=1.2)# yield_gain_raster_map# # tmap_save(yield_gain_raster_map,"figures/yield_gain_raster_map.png", dpi=300,width=4.88,height = 4.16)```## Profit predictions# Conclusion# ReferencesAlston, J.M., Norton, G.W., and Pardey, P. 1995. “Science under scarcity: Principles and practice for agricultural research evaluation and priority setting”. CAB International. See Chapter 7 (pp. 463-498).Athey, S., Tibshirani, J., and Wager, S. 2019. “Generalized Random Forests”. Annals of Statistics 47 (2): 1148-1178. Doi: 10.1214/18-AOS1709.