What are branch support measures testing?
Estimating precision: a number on a phylogeny that shows how sure we are of a relationship / how narrow our range is for it.
Example: gene phylogeny vs. language phylogeny
gene phylogeny is a lot more precise than a language phylogeny
Bremer support value
find the best tree overall, then find the best tree that breaks up the relationship at that node; the support value is the difference between the two
only in parsimony analysis
number represents the difference in length (number of steps) between the best tree overall and the best tree that doesn't contain the group
the higher the number, the higher the support
disadvantage: can't really compare values from one analysis to another
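A tiny arithmetic sketch of the Bremer calculation described above. The tree lengths are made-up numbers for illustration only, and `bremer_support` is a hypothetical helper name, not a function from any phylogenetics package.

```python
def bremer_support(best_length, best_length_without_group):
    """Extra steps needed to lose the group (parsimony tree lengths in steps)."""
    return best_length_without_group - best_length

# Made-up tree lengths, for illustration only
best_overall = 120        # steps in the best (shortest) tree overall
best_without_group = 124  # steps in the best tree that does not contain the group

support = bremer_support(best_overall, best_without_group)
print("Bremer support:", support)
```

Because the value is a count of steps, it depends on the size of the data set, which is why values from different analyses are hard to compare.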
jackknife support value
test of sensitivity to taxon sampling: take out one species and redo the analysis to see how similar the result is. how well have I sampled species to represent the group I'm interested in?
originally used in parsimony
taxon sampling
take out one species and build the tree again; how well does it match the original phylogeny?
numbers on nodes represent the percentage of the time the relationship is recovered
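The delete-one-species idea can be sketched roughly as below. This is only an illustration under stated assumptions: `closest_pair` is a toy stand-in for a real tree search (it just returns the most similar pair of taxa), `taxon_jackknife` is a hypothetical helper name, the sequences are made up, and only taxa outside the group of interest are removed.

```python
from itertools import combinations

def closest_pair(alignment):
    """Toy stand-in for a real tree search: the single most similar pair of taxa."""
    def diffs(a, b):
        return sum(x != y for x, y in zip(alignment[a], alignment[b]))
    best = min(combinations(sorted(alignment), 2), key=lambda pair: diffs(*pair))
    return {frozenset(best)}

def taxon_jackknife(alignment, find_groups, group):
    """Percent of delete-one-taxon runs that still recover `group`.

    Assumption: only taxa outside the group are removed, one at a time.
    """
    group = frozenset(group)
    outsiders = [taxon for taxon in alignment if taxon not in group]
    hits = 0
    for removed in outsiders:
        reduced = {t: seq for t, seq in alignment.items() if t != removed}
        if group in find_groups(reduced):
            hits += 1
    return 100 * hits / len(outsiders)

# Made-up sequences, for illustration only
toy_alignment = {"A": "AAAA", "B": "AAAT", "C": "TTTT", "D": "TTTA"}
print(taxon_jackknife(toy_alignment, closest_pair, {"A", "B"}))
```

In a real analysis `find_groups` would be replaced by an actual tree search that returns every group in the tree.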
Total evidence
combine the data of data set one and data set two, treat them as one huge data set, and get a final result
consensus
data sets are turned into trees individually, then the trees are compared and summarized in a consensus tree
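The difference between the two approaches can be sketched as follows. Assumptions: both data sets cover the same taxa, each tree is represented simply as a set of groups, and a strict consensus (keep only groups found in every tree) is used as one of several possible consensus methods. The function names and the example data are hypothetical.

```python
def combine_data_sets(data_one, data_two):
    """Total evidence: join each taxon's characters into one big data set."""
    return {taxon: data_one[taxon] + data_two[taxon] for taxon in data_one}

def strict_consensus(trees):
    """Consensus: keep only the groups that every individual tree contains."""
    return set.intersection(*(set(tree) for tree in trees))

# Made-up data, for illustration only
gene_one = {"A": "AAT", "B": "AAT", "C": "GGC"}
gene_two = {"A": "CC", "B": "CG", "C": "TT"}
combined = combine_data_sets(gene_one, gene_two)

# Trees from two separate analyses, written as sets of groups
tree_one = {frozenset({"A", "B"}), frozenset({"A", "B", "C"})}
tree_two = {frozenset({"A", "B"}), frozenset({"C", "D"})}
shared_groups = strict_consensus([tree_one, tree_two])
print(combined)
print(shared_groups)
```

With total evidence the tree search runs once on `combined`; with consensus the search runs once per data set and only the resulting trees are compared.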
Do the histories of data sets always agree?
sometimes, but not always. just because a data set results in a saturated, complex answer doesn't mean it disagrees; it may just not be useful.
sampling error
might be missing large chunks of data, so we won't be able to see some connections
"Goldilocks zone"
preferred set of data: combined data can include different genes, some conserved and some more variable; we want a Goldilocks zone
don't want the data to be completely conserved, nor so different that they reach the saturation point
Bootstrap support value
start with original population, then subsample to create a new data set; each data point can be sampled more than once
sampling with replacement
way to test how much support there is for each group
how consistent the signal is across the entire data set
start with an aligned, homologous data set; resample the alignment and see how similar the new result is to the original result
1. randomly resample characters with replacement to make a new data set the same size as the original
2. find the best topology using the new data set
3. repeat (replicates)
end of process: summarize with a number on each node showing how often this node was found in the pseudoreplicates, 100 being the highest
bootstrap support values represent the percentage of replicates that recovered each group.
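The three steps above can be sketched as below. This is a minimal illustration, not a real phylogenetic analysis: `closest_pair` is a toy stand-in for the tree search in step 2 (it only returns the most similar pair of taxa), the function names are hypothetical, and the sequences are made up.

```python
import random
from collections import Counter
from itertools import combinations

def resample_columns(alignment, rng):
    """Step 1: draw characters (columns) with replacement, same size as original."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in columns)
            for taxon, seq in alignment.items()}

def closest_pair(alignment):
    """Toy stand-in for step 2: the single most similar pair of taxa."""
    def diffs(a, b):
        return sum(x != y for x, y in zip(alignment[a], alignment[b]))
    best = min(combinations(sorted(alignment), 2), key=lambda pair: diffs(*pair))
    return {frozenset(best)}

def bootstrap_support(alignment, find_groups, n_replicates=100, seed=0):
    """Step 3: repeat, then report the percent of replicates recovering each group."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_replicates):
        pseudoreplicate = resample_columns(alignment, rng)
        counts.update(find_groups(pseudoreplicate))
    return {group: 100 * n / n_replicates for group, n in counts.items()}

# Made-up sequences, for illustration only
toy_alignment = {"A": "AAAA", "B": "AAAA", "C": "TTTT"}
support = bootstrap_support(toy_alignment, closest_pair)
for group, percent in support.items():
    print(sorted(group), percent)
```

In practice `find_groups` would be a full parsimony, likelihood, or distance analysis that returns every group in the best tree for each pseudoreplicate.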
can compare one analysis to another because values are percentages on the same 0 to 100 scale
Why is the bootstrap support value most widely used?
can be used with any analysis of data, widely applicable to all methodologies
Strengths
relatively easy, straightforward
no new data
Weaknesses / Drawbacks
estimates precision, not accuracy
tends to overestimate confidence
assumption: characters are independent
can take a lot of computational time
Bayesian
when a Bayesian analysis is done, part of the process gives us support values (posterior probabilities)
can be confident in results when they agree with parsimony and maximum likelihood analyses
when unsure, increase taxon sampling or increase the data set!
Shimodaira-Hasegawa (SH) Test
used in a statistical framework (maximum likelihood / Bayesian)
is a certain tree significantly worse than the best tree? if so, it can be rejected and we don't need to worry about that tree
Sister taxon
closest relative of the group you are studying; often used as the closest outgroup