In this document, we’ll walk through the analysis for all of the data from the paper introducing the MODES toolbox. Most of the commentary was written at the point of initial analysis, to make it clear which statistical tests were done post-hoc. Some details were filled in after the fact to improve readability.

Setup

First we need to set some stuff up and read in the data.

# Do set up and collect data
library(ggplot2)
library(dplyr)
library(tidyr)

# flat-violin geom will be needed for raincloud plots
source("geom_flat_violin.R")

# Function for calculating Glass's delta (effect size)
glass_delta <- function(m_exp, m_cont, sd_cont) {
  return((m_exp - m_cont) / sd_cont)
}

# Set up theme for all plots
theme_set(theme_classic())
theme_update(axis.title = element_text(size=18),
             axis.text = element_text(size=12),
             legend.text = element_text(size=14),
             legend.title = element_text(size=14),
             legend.position = "bottom",
             strip.text = element_text(size=18))

nk_data <- read.csv("../data/nk_data.csv")
avida_data <- read.csv("../data/avida_data.csv")

Okay, now we’re ready to explore the data.

Avida analysis

Let’s start with the Avida data. First we’ll set up some useful subsets of the data:

# Just the data from the empty environment (for exploring the effect of filter length)
empty <- avida_data %>% filter(res == 100, environment %in% c("empty"), filter %in% c(500, 1000, 2000, 4000), fixedlength == 0)
# Just the endpoints of the empty environment (for looking at the final distribution of metrics)
empty_end <- avida_data %>% filter(res == 100, environment %in% c("empty"), filter %in% c(500, 1000, 2000, 4000), fixedlength == 0, generation == 200000)
# Subset of data for comparing the empty environment to the logic-9 environment (exclude the filter length sweep)
env_comp <- avida_data %>% filter(res == 100, environment %in% c("empty", "l9"), filter == population_size, fixedlength == 0)
# And just the end points of that data
env_comp_end <- avida_data %>% filter(res == 100, environment %in% c("empty", "l9"), filter == population_size, fixedlength == 0, generation == 200000)
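
Before moving on, a quick sanity check (hypothetical, not part of the original analysis) that the subsets contain the factor combinations we expect:

# Hypothetical sanity check: list the population size / filter length
# combinations present in each subset
empty %>% distinct(population_size, filter) %>% arrange(population_size, filter)
env_comp %>% distinct(environment, population_size) %>% arrange(environment, population_size)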

Change

First, let’s take a look at the change metric.

Filter length

# Line plot of change over time with bootstrapped confidence interval. Show separate lines for each filter length and make a plot for each population size
ggplot(data=empty) + 
  stat_summary(aes(x=generation, y=change, color=as.factor(filter), fill=as.factor(filter)), fun.data="mean_cl_boot", geom="smooth") + # Make lines + confidence interval
  facet_wrap(~population_size) + # Different plot for each population size
  scale_x_continuous("Average Generation", labels = function(x) format(x, scientific = TRUE), expand=c(0.15, 0.15), limits=c(0,200000), breaks=c(0, 100000, 200000)) + # Make sure axis labels don't overlap
  scale_y_continuous("Change") + # Label y axis
  scale_color_discrete("Filter length (t)") + scale_fill_discrete("Filter length (t)")  + # Label legend
  geom_hline(yintercept=c(1,1), linetype="dotted") # Add line showing expected value
## Warning: Removed 1 rows containing non-finite values (stat_summary).

ggsave("../figs/avida_filter_change.png") # Save figure
## Warning: Removed 1 rows containing non-finite values (stat_summary).

Okay, looks like the effect of the filter length is roughly as expected. Higher filter lengths more closely approximate the ground truth (change=1). It’s a little hard to tell how close they get from this view. Let’s try a raincloud plot:

# Raincloud plots are a cool new way to visualize distributions
# Let's make one of change at the final time point so we can get an accurate read on the types of results we'd expect to see from using a given filter length
ggplot(data=empty_end, aes(x=as.factor(filter), y=change, fill=as.factor(filter))) + 
  geom_flat_violin(position = position_nudge(x = .2, y = 0)) + # Flat violin plot summarizes shape of data
  geom_point(aes(color = as.factor(filter)), position = position_jitter(width = .15), alpha = 0.8) + # Show actual data points 
  stat_summary(fun.data="mean_cl_boot", show.legend = FALSE) + # Draw mean and bootstrapped confidence interval
  facet_wrap(~population_size, scales = "free_x") + # Use different x axes for each plot because 4000 only exists for population size 2000
  scale_x_discrete("Filter length (t)") + # label axes + legends
  scale_y_continuous("Change") + 
  scale_color_discrete("Filter length (t)") + 
  scale_fill_discrete("Filter length (t)") + 
  geom_hline(yintercept=c(1,1), linetype="dotted") + # Show expected changed value
  theme(legend.position="None") # The axis and the legend are redundant

ggsave("../figs/avida_filter_change_end.png")

Okay, from this we can tell that using population size as t always produces reasonable results. Population size = t seems to get a little less noisy as population size increases, possibly due to the increased selection strength. The results of t=4000 for population size 2000 make it look like the actual ground truth is potentially a little below 1, which is reasonable (we don’t necessarily expect change to occur in every interval as evolution goes on). Even using half the population size as t appears to be generally reasonable. Once you go lower than that, though, things start to get pretty noisy. Using t greater than population size appears to clean things up a bit more, but there are diminishing returns. Based on this, we’d recommend population size or 2*population size as filter length when possible.
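
A quick aside on the error bands in these plots: mean_cl_boot is the summary function that ggplot2 wraps around Hmisc::smean.cl.boot, which computes the mean and a non-parametric bootstrapped confidence interval. A minimal sketch of what it returns, using toy data:

# Sketch of what mean_cl_boot computes (requires Hmisc, which ggplot2 uses
# under the hood); the toy vector stands in for one treatment's change values
library(Hmisc)
set.seed(1)  # arbitrary seed so the bootstrap is reproducible
x <- rnorm(30, mean = 1, sd = 0.2)
smean.cl.boot(x, conf.int = 0.95, B = 1000)  # named vector: Mean, Lower, Upper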

Environment

Okay, what about the effect of the environment?

# Line plot of change over time with bootstrapped confidence interval. Show separate lines for each population size and make a plot for each environment
ggplot(data=env_comp) + stat_summary(aes(x=generation, y=change, color=as.factor(population_size), fill=as.factor(population_size)), fun.data="mean_cl_boot", geom="smooth") + facet_wrap(~environment, scales = "free_y") + scale_x_continuous("Average Generation", labels = function(x) format(x, scientific = TRUE), expand=c(0.1, 0.1), limits=c(0,200000), breaks=c(0, 100000, 200000)) + scale_y_continuous("Change") + scale_color_discrete("Population size") + scale_fill_discrete("Population size")
## Warning: Removed 1 rows containing non-finite values (stat_summary).

ggsave("../figs/avida_env_change.png")
## Warning: Removed 1 rows containing non-finite values (stat_summary).

In general, logic-9 is a lot higher (hence having to use free_y scales on the facet wrap). The effect of population size also seems to be reversed between the two environments. In logic-9, larger populations allow for more change (which makes sense, because it lets the population explore the fitness landscape in different directions). Perhaps the empty environment is reversed because the larger population size increases selection pressure and there’s nothing to balance it out?

It’s a little hard to see what’s going on at the end, so let’s make another raincloud plot:

ggplot(data=env_comp_end, aes(x=as.factor(population_size), y=change, fill=as.factor(population_size))) + geom_flat_violin(position = position_nudge(x = .2, y = 0)) + geom_point(aes(color = as.factor(population_size)), position = position_jitter(width = .15), alpha = 0.8) + stat_summary(fun.data="mean_cl_boot", show.legend = FALSE) + facet_wrap(~environment, scales = "free_x") + scale_x_discrete("Population size") + scale_y_continuous("Change") + geom_hline(yintercept=c(1,1), linetype="dotted")  + theme(legend.position="None")

ggsave("../figs/avida_env_change_end.png")

Okay, this makes it clearer. There’s really not much of an effect of population size on change in the empty environment. In Logic-9, the increase is primarily the result of the tail of the distribution getting longer.

Novelty

What about novelty? Let’s make the same plots.

Filter length

ggplot(data=empty) + stat_summary(aes(x=generation, y=novelty, color=as.factor(filter), fill=as.factor(filter)), fun.data="mean_cl_boot", geom="smooth") + facet_wrap(~population_size) + scale_x_continuous("Average Generation", labels = function(x) format(x, scientific = TRUE), expand=c(0.15, 0.15), limits=c(0,200000), breaks=c(0, 100000, 200000)) + scale_y_continuous("Novelty") + scale_color_discrete("Filter length (t)")+ scale_fill_discrete("Filter length (t)")  + geom_hline(yintercept=c(1,1), linetype="dotted")
## Warning: Removed 1 rows containing non-finite values (stat_summary).

ggsave("../figs/avida_filter_novelty.png")
## Warning: Removed 1 rows containing non-finite values (stat_summary).
ggplot(data=empty_end, aes(x=as.factor(filter), y=novelty, fill=as.factor(filter))) + geom_flat_violin(position = position_nudge(x = .2, y = 0)) + geom_point(aes(color = as.factor(filter)), position = position_jitter(width = .15), alpha = 0.8) + stat_summary(fun.data="mean_cl_boot", show.legend = FALSE) + facet_wrap(~population_size, scales = "free_x") + scale_x_discrete("Filter length (t)") + scale_y_continuous("Novelty") + scale_color_discrete("Filter length (t)")+ scale_fill_discrete("Filter length (t)") + geom_hline(yintercept=c(1,1), linetype="dotted") + theme(legend.position="None")

ggsave("../figs/avida_filter_novelty_end.png")

Novelty is almost identical to change. This is consistent with fitness continuously increasing (there shouldn’t be opportunities for back-tracking).
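
If we wanted to quantify that similarity rather than just eyeball it, one option (hypothetical, not part of the original analysis) would be a rank correlation between the two metrics at the final time point:

# Hypothetical check: how tightly do change and novelty track each other
# at the final time point? (Not run as part of the original analysis.)
cor.test(empty_end$change, empty_end$novelty, method = "spearman")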

Environment

And what about the effect of environment on novelty?

ggplot(data=env_comp) + stat_summary(aes(x=generation, y=novelty, color=as.factor(population_size), fill=as.factor(population_size)), fun.data="mean_cl_boot", geom="smooth") + facet_wrap(~environment, scales = "free_y") + scale_x_continuous("Average Generation", labels = function(x) format(x, scientific = TRUE), expand=c(0.1, 0.1), limits=c(0,200000), breaks=c(0, 100000, 200000)) + scale_y_continuous("Novelty") + scale_color_discrete("Population size") + scale_fill_discrete("Population size")
## Warning: Removed 1 rows containing non-finite values (stat_summary).

ggsave("../figs/avida_env_novelty.png")
## Warning: Removed 1 rows containing non-finite values (stat_summary).
ggplot(data=env_comp_end, aes(x=as.factor(population_size), y=novelty, fill=as.factor(population_size))) + geom_flat_violin(position = position_nudge(x = .2, y = 0)) + geom_point(aes(color = as.factor(population_size)), position = position_jitter(width = .15), alpha = 0.8) + stat_summary(fun.data="mean_cl_boot", show.legend = FALSE) + facet_wrap(~environment, scales = "free_x") + scale_x_discrete("Population size") + scale_y_continuous("Novelty") + theme(legend.position="None")

ggsave("../figs/avida_env_novelty_end.png")

Again, this is basically identical to the change results.

Ecology

Okay, let’s check out ecology.

Filter length

ggplot(data=empty) + stat_summary(aes(x=generation, y=ecology, color=as.factor(filter), fill=as.factor(filter)), fun.data="mean_cl_boot", geom="smooth") + facet_wrap(~population_size) + scale_x_continuous("Average Generation", labels = function(x) format(x, scientific = TRUE), expand=c(0.15, 0.15), limits=c(0,200000), breaks=c(0, 100000, 200000)) + scale_y_continuous("Ecology") + scale_color_discrete("Filter length (t)")+ scale_fill_discrete("Filter length (t)")
## Warning: Removed 1 rows containing non-finite values (stat_summary).

ggsave("../figs/avida_filter_ecology.png")
## Warning: Removed 1 rows containing non-finite values (stat_summary).

Not sure if that curving pattern is real or what’s going on with it. Interestingly, despite change generally being pretty close to 1, ecology is also pretty close to 1. This implies that the new genotypes registering as “change” don’t wipe everything else out that rapidly: there must be genotypes hanging around for long periods of time.

ggplot(data=empty_end, aes(x=as.factor(filter), y=ecology, fill=as.factor(filter))) + geom_flat_violin(position = position_nudge(x = .2, y = 0)) + geom_point(aes(color = as.factor(filter)), position = position_jitter(width = .15), alpha = 0.8) + stat_summary(fun.data="mean_cl_boot", show.legend = FALSE) + facet_wrap(~population_size, scales = "free_x") + scale_x_discrete("Filter length (t)") + scale_y_continuous("Ecology") + scale_color_discrete("Filter length (t)")+ scale_fill_discrete("Filter length (t)")  + theme(legend.position="None")

ggsave("../figs/avida_filter_ecology_end.png")

Ecology seems a little more robust to using a low filter length (the tail on 500 in population size 2000 is much more reasonable), which makes sense, because things that are on their way out probably aren’t very plentiful. The change in selection strength induced by increased population size doesn’t seem to have a noticeable effect (filter length = population size produces equivalent ecology across population sizes). This also implies that the reason for low ecology is not a constraint induced by the population size (which is obvious in this case, because we know this is a single-niche environment, but wouldn’t be obvious if we didn’t know as much about the system).

Environment

What about the effect of environment?

ggplot(data=env_comp) + stat_summary(aes(x=generation, y=ecology, color=as.factor(population_size), fill=as.factor(population_size)), fun.data="mean_cl_boot", geom="smooth") + facet_wrap(~environment, scales = "free_y") + scale_x_continuous("Average Generation", labels = function(x) format(x, scientific = TRUE), expand=c(0.1, 0.1), limits=c(0,200000), breaks=c(0, 100000, 200000)) + scale_y_continuous("Ecology") + scale_color_discrete("Population size") + scale_fill_discrete("Population size")
## Warning: Removed 1 rows containing non-finite values (stat_summary).

ggsave("../figs/avida_env_ecology.png")
## Warning: Removed 1 rows containing non-finite values (stat_summary).
ggplot(data=env_comp_end, aes(x=as.factor(population_size), y=ecology, fill=as.factor(population_size))) + geom_flat_violin(position = position_nudge(x = .2, y = 0)) + geom_point(aes(color = as.factor(population_size)), position = position_jitter(width = .15), alpha = 0.8) + stat_summary(fun.data="mean_cl_boot", show.legend = FALSE) + facet_wrap(~environment, scales = "free_x") + scale_x_discrete("Population size") + scale_y_continuous("Ecology") +  theme(legend.position="None")

ggsave("../figs/avida_env_ecology_end.png")

Since these are both single niche environments, this is about what we would expect. Ecology is pretty constant and no higher than 1.

Complexity

Last but not least, there’s complexity.

Filter length

ggplot(data=empty) + stat_summary(aes(x=generation, y=complexity, color=as.factor(filter), fill=as.factor(filter)), fun.data="mean_cl_boot", geom="smooth") + facet_wrap(~population_size) + scale_x_continuous("Average Generation", labels = function(x) format(x, scientific = TRUE), expand=c(0.15, 0.15), limits=c(0,200000), breaks=c(0, 100000, 200000)) + scale_y_continuous("Complexity") + scale_color_discrete("Filter length (t)")+ scale_fill_discrete("Filter length (t)")
## Warning: Removed 1 rows containing non-finite values (stat_summary).

ggsave("../figs/avida_filter_complexity.png")
## Warning: Removed 1 rows containing non-finite values (stat_summary).

Basically, things rapidly increase in complexity as they become better than the ancestor. Then they decrease as they find ways to optimize, and soon approximately level out.

ggplot(data=empty_end, aes(x=as.factor(filter), y=complexity, fill=as.factor(filter))) + geom_flat_violin(position = position_nudge(x = .2, y = 0)) + geom_point(aes(color = as.factor(filter)), position = position_jitter(width = .15), alpha = 0.8) + stat_summary(fun.data="mean_cl_boot", show.legend = FALSE) + facet_wrap(~population_size, scales = "free_x") + scale_x_discrete("Filter length (t)") + scale_y_continuous("Complexity") + scale_color_discrete("Filter length (t)")+ scale_fill_discrete("Filter length (t)")  + theme(legend.position="None")

ggsave("../figs/avida_filter_complexity_end.png")

By the end, complexity is pretty consistent in all of the populations, so there’s not much going on here.

Environment

ggplot(data=env_comp) + stat_summary(aes(x=generation, y=complexity, color=as.factor(population_size), fill=as.factor(population_size)), fun.data="mean_cl_boot", geom="smooth") + facet_wrap(~environment, scales = "free_y") + scale_x_continuous("Average Generation", labels = function(x) format(x, scientific = TRUE), expand=c(0.1, 0.1), limits=c(0,200000), breaks=c(0, 100000, 200000)) + scale_y_continuous("Complexity") + scale_color_discrete("Population size") + scale_fill_discrete("Population size")
## Warning: Removed 1 rows containing non-finite values (stat_summary).

ggsave("../figs/avida_env_complexity.png")
## Warning: Removed 1 rows containing non-finite values (stat_summary).

These certainly have the appearance of a different long-term trend. We’d need to do stats, but it seems like there’s a good chance logic-9 keeps going up. Either way, logic-9 is clearly higher, which makes sense because the environment is more complex.

ggplot(data=env_comp_end, aes(x=as.factor(population_size), y=complexity, fill=as.factor(population_size))) + geom_flat_violin(position = position_nudge(x = .2, y = 0)) + geom_point(aes(color = as.factor(filter)), position = position_jitter(width = .15), alpha = 0.8) + stat_summary(fun.data="mean_cl_boot", show.legend = FALSE) + facet_wrap(~environment, scales = "free_x") + scale_x_discrete("Population size") + scale_y_continuous("Complexity") + geom_hline(yintercept=c(1,1), linetype="dotted")  + theme(legend.position="None")

ggsave("../figs/avida_env_complexity_end.png")

This drives home the point that logic-9 has higher complexity. It’s interesting that population size has so little effect.

NK Landscape Analysis

First we need to clean up the NK dataframe a little:

# Label treatments correctly
nk_data <- transform(nk_data, treatment = case_when(
    K == 10 ~ "High K\n(10)", 
    N == 100 ~ "High N\n(100)", 
    MUT_RATE == .005 ~ "Low\nmutation\n(.005)", 
    MUT_RATE == .1 ~ "High\nmutation\n(.1)",
    POP_SIZE == 20 ~ "Small\npop\n(20)",
    POP_SIZE == 2000 ~ "Large\npop\n(2000)",
    CHANGE_TYPE == 1 ~ "Oscillating\nenvironment",
    CHANGE_RATE == 500 ~ "Changing\nenvironment",
    SELECTION == 1 ~ "Fitness\nsharing",
    TRUE   ~ "Base" 
))

# Make things that we're treating as factors into factors
nk_data$POP_SIZE <- as.factor(nk_data$POP_SIZE)
nk_data$MUT_RATE <- as.factor(nk_data$MUT_RATE)

# Order treatments such that axis labels don't overlap
nk_data$treatment <- factor(nk_data$treatment, levels=c("Base",  "Low\nmutation\n(.005)","High\nmutation\n(.1)","Small\npop\n(20)","Large\npop\n(2000)","Oscillating\nenvironment","High K\n(10)","High N\n(100)", "Changing\nenvironment", "Fitness\nsharing"))

# Grab some useful subsets of the data

# The final generation for all runs
nk_endpoints <- nk_data %>% filter(generation == 5000)

# Only the runs where we used t equal to population size
nk_filter_pop_size <- nk_data %>% filter(FILTER_LENGTH == POP_SIZE, treatment != "Changing\nenvironment")
# The ends of those runs
nk_filter_pop_size_endpoints <- nk_filter_pop_size %>% filter(generation == 5000)

Change

First, let’s take a look at change over time across conditions.

ggplot(data=nk_filter_pop_size) + stat_summary(fun.data = mean_cl_boot, aes(y=change, x=generation, color=treatment, fill=treatment), geom="smooth") + scale_x_continuous("Generation") + scale_y_continuous("Change")

Wow, that’s too many lines. As expected, Base, Fitness Sharing, and Oscillating are the only ones really doing interesting stuff over time (the spike in the large-pop condition is just the change detected when the first filter window elapses). Let’s make a plot with just them.

ggplot(data=nk_filter_pop_size %>% filter(treatment %in% c("Base", "Fitness\nsharing", "Oscillating\nenvironment"))) + stat_summary(fun.data = mean_cl_boot, aes(y=change, x=generation, color=treatment, fill=treatment), geom="smooth") + scale_x_continuous("Generation") + scale_y_continuous("Change")  + scale_color_discrete("") + scale_fill_discrete("")

ggsave("../figs/change_changing_environments.png")

And let’s take a look at the endpoints to summarize all the conditions:

ggplot(data=nk_filter_pop_size_endpoints, aes(x=treatment, y=change, fill=treatment)) + geom_flat_violin(position = position_nudge(x = .2, y = 0)) + geom_point(aes(color = treatment), position = position_jitter(height = .1, width = .15), alpha = 0.8) + stat_summary(fun.data="mean_cl_boot", show.legend = FALSE) + scale_y_continuous("Change")  + scale_x_discrete("") + theme(legend.position="None")

ggsave("../figs/changeboxplots.png")

Stats

And now let’s make sure that the things that look significantly different actually are. First we do a Kruskal-Wallis test to see if there are any differences among groups:

kruskal.test(change ~ treatment, data = nk_filter_pop_size_endpoints)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  change by treatment
## Kruskal-Wallis chi-squared = 133.28, df = 8, p-value < 2.2e-16

Looks like there are. Now let’s do post-hoc Wilcoxon rank-sum tests vs. the base treatment with a Bonferroni correction for multiple comparisons to see which of these differences are actually significant. With 9 groups, the corrected significance threshold is .05/9 ≈ .0056.

p_vals <- data.frame(treatment = character(), p = numeric(), stringsAsFactors = FALSE)
base_change <- nk_filter_pop_size_endpoints %>% filter(treatment == "Base")
for (t in unique(as.character(nk_filter_pop_size_endpoints$treatment))) {
    # Wilcoxon rank-sum test of this treatment vs. the base treatment
    treat_change <- nk_filter_pop_size_endpoints %>% filter(treatment == t)
    w <- wilcox.test(base_change$change, treat_change$change, exact = FALSE)
    p_vals <- rbind(p_vals, data.frame(treatment = t, p = w$p.value, stringsAsFactors = FALSE))
}

# Bonferroni correction: significant only if p < alpha / number of comparisons
p_vals$sig <- p_vals$p < .05/length(p_vals$p)

p_vals
##                  treatment            p   sig
## 1 Oscillating\nenvironment 6.543330e-01 FALSE
## 2         Fitness\nsharing 1.969287e-05  TRUE
## 3         Small\npop\n(20) 3.910521e-03  TRUE
## 4     High\nmutation\n(.1) 4.484584e-09  TRUE
## 5             High K\n(10) 6.543330e-01 FALSE
## 6                     Base 1.000000e+00 FALSE
## 7       Large\npop\n(2000) 3.128169e-01 FALSE
## 8            High N\n(100) 8.140422e-02 FALSE
## 9    Low\nmutation\n(.005) 8.140422e-02 FALSE

Yep, that’s about what we’d expect (since the oscillating environment doesn’t end on an oscillation).
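
As an aside, base R can express this comparison more compactly with pairwise.wilcox.test, which applies the multiple-comparison correction for us. A sketch (not how the table above was produced; note it corrects across all pairs of treatments rather than just the comparisons against Base, so it is more conservative):

# Alternative sketch using base R's built-in pairwise Wilcoxon tests
pairwise.wilcox.test(nk_filter_pop_size_endpoints$change,
                     nk_filter_pop_size_endpoints$treatment,
                     p.adjust.method = "bonferroni", exact = FALSE)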

Novelty

ggplot(data=nk_filter_pop_size) + stat_summary(fun.data = mean_cl_boot, aes(y=novelty, x=generation, color=treatment, fill=treatment), geom="smooth") + scale_x_continuous("Generation") + scale_y_continuous("Novelty") + scale_color_discrete("") + scale_fill_discrete("")

Again, too many lines, but we can see that there is a general downward trend among the conditions that start high enough to have room to decline.

Let’s plot an informative subset of the conditions (changing mutation rates):

ggplot(data=nk_filter_pop_size %>% filter(treatment %in% c("Base", "Low\nmutation\n(.005)", "High\nmutation\n(.1)"))) + stat_summary(fun.data = mean_cl_boot, aes(y=novelty, x=generation, color=treatment, fill=treatment), geom="smooth") + scale_x_continuous("Generation") + scale_y_continuous("Novelty") + scale_color_discrete("") + scale_fill_discrete("")

ggsave("../figs/novelty_mean_mut_rate.png")

And here’s a summary of the endpoints for all conditions:

ggplot(data=nk_filter_pop_size_endpoints, aes(x=treatment, y=novelty, fill=treatment)) + geom_flat_violin(position = position_nudge(x = .2, y = 0)) + geom_point(aes(color = treatment), position = position_jitter(height = .1, width = .15), alpha = 0.8) + stat_summary(fun.data="mean_cl_boot", show.legend = FALSE) + scale_y_continuous("Novelty")  + scale_x_discrete("") + theme(legend.position="None")

ggsave("../figs/noveltyboxplots.png")

Stats

And now let’s make sure that the things that look significantly different actually are. First we do a Kruskal-Wallis test to see if there are any differences among groups:

kruskal.test(novelty ~ treatment, data = nk_filter_pop_size_endpoints)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  novelty by treatment
## Kruskal-Wallis chi-squared = 62.479, df = 8, p-value = 1.517e-10

Looks like there are. Now let’s do post-hoc Wilcoxon rank-sum tests vs. the base treatment with a Bonferroni correction for multiple comparisons to see which of these differences are actually significant.

p_vals <- data.frame(treatment = character(), p = numeric(), stringsAsFactors = FALSE)
base_novelty <- nk_filter_pop_size_endpoints %>% filter(treatment == "Base")
for (t in unique(as.character(nk_filter_pop_size_endpoints$treatment))) {
    # Wilcoxon rank-sum test of this treatment vs. the base treatment
    treat_novelty <- nk_filter_pop_size_endpoints %>% filter(treatment == t)
    w <- wilcox.test(base_novelty$novelty, treat_novelty$novelty, exact = FALSE)
    p_vals <- rbind(p_vals, data.frame(treatment = t, p = w$p.value, stringsAsFactors = FALSE))
}

p_vals$sig <- p_vals$p < .05/length(p_vals$p)

p_vals
##                  treatment          p   sig
## 1 Oscillating\nenvironment 0.33371070 FALSE
## 2         Fitness\nsharing 0.33371070 FALSE
## 3         Small\npop\n(20) 0.57016252 FALSE
## 4     High\nmutation\n(.1) 0.00301088  TRUE
## 5             High K\n(10) 0.33371070 FALSE
## 6                     Base 1.00000000 FALSE
## 7       Large\npop\n(2000) 0.33371070 FALSE
## 8            High N\n(100) 0.33371070 FALSE
## 9    Low\nmutation\n(.005) 0.33371070 FALSE

Wow, it’s really just the high mutation condition that’s significant. That’s a little surprising, but it looks consistent with the graphs.

Ecology

ggplot(data=nk_filter_pop_size) + stat_summary(fun.data = mean_cl_boot, aes(y=ecology, x=generation, color=treatment, fill=treatment), geom="smooth") + scale_x_continuous("Generation") + scale_y_continuous("Ecology") + scale_color_discrete("") + scale_fill_discrete("")

Really, the main take-away here is that fitness sharing creates ecology and nothing else does. We probably don’t need to include a graph of ecology over time in the paper.

The endpoints are informative, though:

ggplot(data=nk_filter_pop_size_endpoints, aes(x=treatment, y=ecology, fill=treatment)) + geom_flat_violin(position = position_nudge(x = .2, y = 0)) + geom_point(aes(color = treatment), position = position_jitter(height = .1, width = .15), alpha = 0.8) + stat_summary(fun.data="mean_cl_boot", show.legend = FALSE) + scale_y_continuous("Ecology")  + scale_x_discrete("") + theme(legend.position="None")

ggsave("../figs/ecologyboxplots.png")

Stats

And now let’s make sure that the things that look significantly different actually are. First we do a Kruskal-Wallis test to see if there are any differences among groups:

kruskal.test(ecology ~ treatment, data = nk_filter_pop_size_endpoints)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  ecology by treatment
## Kruskal-Wallis chi-squared = 241.57, df = 8, p-value < 2.2e-16

Looks like there are. Now let’s do post-hoc Wilcoxon rank-sum tests vs. the base treatment with a Bonferroni correction for multiple comparisons to see which of these differences are actually significant.

p_vals <- data.frame(treatment = character(), p = numeric(), stringsAsFactors = FALSE)
base_ecology <- nk_filter_pop_size_endpoints %>% filter(treatment == "Base")
for (t in unique(as.character(nk_filter_pop_size_endpoints$treatment))) {
    # Wilcoxon rank-sum test of this treatment vs. the base treatment
    treat_ecology <- nk_filter_pop_size_endpoints %>% filter(treatment == t)
    w <- wilcox.test(base_ecology$ecology, treat_ecology$ecology, exact = FALSE)
    p_vals <- rbind(p_vals, data.frame(treatment = t, p = w$p.value, stringsAsFactors = FALSE))
}

p_vals$sig <- p_vals$p < .05/length(p_vals$p)

p_vals
##                  treatment            p   sig
## 1 Oscillating\nenvironment 3.337107e-01 FALSE
## 2         Fitness\nsharing 9.102737e-12  TRUE
## 3         Small\npop\n(20) 1.000000e+00 FALSE
## 4     High\nmutation\n(.1) 1.000000e+00 FALSE
## 5             High K\n(10) 3.337107e-01 FALSE
## 6                     Base 1.000000e+00 FALSE
## 7       Large\npop\n(2000) 3.337107e-01 FALSE
## 8            High N\n(100) 3.337107e-01 FALSE
## 9    Low\nmutation\n(.005) 3.337107e-01 FALSE

As expected, only fitness sharing is significant.

Complexity

ggplot(data=nk_filter_pop_size) + stat_summary(fun.data = mean_cl_boot, aes(y=complexity, x=generation, color=treatment, fill=treatment), geom="smooth") + scale_x_continuous("Generation") + scale_y_continuous("Complexity") + scale_color_discrete("") + scale_fill_discrete("")

Obviously, high N can reach higher complexity, and it approaches the maximum more gradually. It’s throwing off the scale, though, so let’s make a graph without it.

ggplot(data=nk_filter_pop_size %>% filter(treatment != "High N\n(100)")) + stat_summary(fun.data = mean_cl_boot, aes(y=complexity, x=generation, color=treatment, fill=treatment), geom="smooth") + scale_x_continuous("Generation") + scale_y_continuous("Complexity") + scale_color_discrete("") + scale_fill_discrete("")

Nothing super interesting about looking at this over time. As expected, the oscillating environment is periodically pushed away from maximum complexity. We can just graph the endpoints to get the interesting part of the story: fitness sharing, small pop, and high mutation are generally lower.

ggplot(data=nk_filter_pop_size_endpoints %>% filter(treatment != "High N\n(100)"), aes(x=treatment, y=complexity, fill=treatment)) + geom_flat_violin(position = position_nudge(x = .2, y = 0)) + geom_point(aes(color = treatment), position = position_jitter(height = .1, width = .15), alpha = 0.8) + stat_summary(fun.data="mean_cl_boot", show.legend = FALSE) + scale_y_continuous("Complexity")  + scale_x_discrete("") + theme(legend.position="None")

ggsave("../figs/complexityboxplots.png")

Oh, right, the re-implemented version of the NK Landscape incorporates fitness sharing into the fitness function itself (i.e. the one that is used for the complexity calculation). As a result, complexity in fitness sharing isn’t actually decreased notably despite the fact that the population is presumably not sitting on an actual fitness peak in the NK Landscape. This is probably the correct way to measure it though, since fitness sharing does change the fitness landscape.

Stats

And now let’s make sure that the things that look significantly different actually are. First we do a Kruskal-Wallis test to see if there are any differences among groups:

kruskal.test(complexity ~ treatment, data = nk_filter_pop_size_endpoints)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  complexity by treatment
## Kruskal-Wallis chi-squared = 219.36, df = 8, p-value < 2.2e-16

Looks like there are. Now let’s do post-hoc Wilcoxon rank-sum tests vs. the base treatment with a Bonferroni correction for multiple comparisons to see which of these differences are actually significant.

p_vals <- data.frame(treatment = character(), p = numeric(), stringsAsFactors = FALSE)
base_complexity <- nk_filter_pop_size_endpoints %>% filter(treatment == "Base")
for (t in unique(as.character(nk_filter_pop_size_endpoints$treatment))) {
    # Wilcoxon rank-sum test of this treatment vs. the base treatment
    treat_complexity <- nk_filter_pop_size_endpoints %>% filter(treatment == t)
    w <- wilcox.test(base_complexity$complexity, treat_complexity$complexity, exact = FALSE)
    p_vals <- rbind(p_vals, data.frame(treatment = t, p = w$p.value, stringsAsFactors = FALSE))
}

p_vals$sig <- p_vals$p < .05/length(p_vals$p)

p_vals
##                  treatment            p   sig
## 1 Oscillating\nenvironment 1.608021e-01 FALSE
## 2         Fitness\nsharing 5.570558e-01 FALSE
## 3         Small\npop\n(20) 4.692154e-02 FALSE
## 4     High\nmutation\n(.1) 1.672051e-09  TRUE
## 5             High K\n(10) 1.608021e-01 FALSE
## 6                     Base 1.000000e+00 FALSE
## 7       Large\npop\n(2000) 5.570558e-01 FALSE
## 8            High N\n(100) 4.161706e-14  TRUE
## 9    Low\nmutation\n(.005) 1.608021e-01 FALSE

Effect sizes

We’ve determined which changes are significant, but we should also determine whether they have a meaningful effect size. Since our groups will have different standard deviations, we use Glass’s delta. As a rule of thumb, ~.2 is considered low, ~.5 is considered medium, and ~.8 is considered high.
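
As a quick illustration of the scale (with made-up numbers, not values from the data), Glass’s delta is just the difference in means expressed in units of the control group’s standard deviation:

# Toy illustration with made-up numbers (not from the data): an experimental
# mean of 1.4 vs. a control mean of 1.0, with a control standard deviation of 0.5
glass_delta(1.4, 1.0, 0.5)  # (1.4 - 1.0) / 0.5 = 0.8, a "high" effect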

# Make data frame containing mean and standard deviation of all metrics for all treatments
summary_df <- nk_filter_pop_size_endpoints %>% group_by(treatment) %>% 
  summarise(mean_change = mean(change), 
            sd_change = sd(change), 
            mean_novelty = mean(novelty), 
            sd_novelty = sd(novelty), 
            mean_ecology = mean(ecology), 
            sd_ecology = sd(ecology), 
            mean_complexity = mean(complexity), 
            sd_complexity = sd(complexity))

# Take subset of that dataframe containing base treatment, for ease of calculating Glass's Delta
base_df <- summary_df %>% filter(treatment == "Base")

# Calculate effect sizes
effect_sizes <- summary_df %>% 
  transmute(treatment, 
            glass_change = glass_delta(mean_change, base_df$mean_change, base_df$sd_change), 
            glass_novelty = glass_delta(mean_novelty, base_df$mean_novelty, base_df$sd_novelty), 
            glass_ecology = glass_delta(mean_ecology, base_df$mean_ecology, base_df$sd_ecology), 
            glass_complexity = glass_delta(mean_complexity, base_df$mean_complexity, base_df$sd_complexity))

effect_sizes
## # A tibble: 9 x 5
##   treatment      glass_change glass_novelty glass_ecology glass_complexity
##   <fct>                 <dbl>         <dbl>         <dbl>            <dbl>
## 1 Base                 0             0            0                 0     
## 2 "Low\nmutatio…      -0.1          -0.0333      -0.0324            0.1   
## 3 "High\nmutati…       0.8           0.3         -0.00359          -2.63  
## 4 "Small\npop\n…       0.333         0.0333      -0.00177          -0.233 
## 5 "Large\npop\n…      -0.0667       -0.0333      -0.0324            0.0667
## 6 "Oscillating\…      -0.0333       -0.0333      -0.0324            0.1   
## 7 "High K\n(10)"      -0.0333       -0.0333      -0.0324            0.1   
## 8 "High N\n(100…      -0.1          -0.0333      -0.0324           80.1   
## 9 "Fitness\nsha…       0.633        -0.0333       0.913             0.0667

Okay, so those are generally medium-high for everything that is significant in the first place. Of course, the effect of high N on complexity is just ridiculous, as it should be.