mol evo. 2.2 Branch Support Flashcards

What are branch support measures testing?

They estimate precision: the number on a node of a phylogeny shows how sure we are of that relationship (how narrow our range of estimates is).

Example of gene phylogeny and language

the gene phylogeny is a lot more precise than the language phylogeny

Bremer support value

find the best tree overall, then find the best tree that breaks up the relationship (does not contain that node). the support value is the difference between those two trees.

only in parsimony analysis

number represents the difference in steps (tree length) between the best tree overall and the best tree that doesn't contain the group

the higher the number the higher the support

disadvantage: can't really compare values from one analysis to another
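
A minimal sketch of the Bremer arithmetic, assuming a toy 4-taxon alignment and Fitch parsimony to count steps. The sequences, the function names fitch_length and bremer_support, and the idea of labeling each tree by its one internal group are all made up for illustration; real analyses search many more trees.

```python
def fitch_length(tree, alignment):
    """Parsimony length (steps) of a nested-tuple tree, via Fitch's algorithm."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for site in range(n_sites):
        steps = 0

        def states(node):
            nonlocal steps
            if isinstance(node, str):          # a tip: its observed state
                return {alignment[node][site]}
            left, right = states(node[0]), states(node[1])
            if left & right:                   # children agree: no extra step
                return left & right
            steps += 1                         # children disagree: one step
            return left | right

        states(tree)
        total += steps
    return total


def bremer_support(lengths, group):
    """Extra steps needed by the best tree that lacks `group`."""
    best_overall = min(lengths.values())
    best_without = min(n for g, n in lengths.items() if g != group)
    return best_without - best_overall


# Hypothetical toy data: three sites favor A+B, one site favors A+C.
alignment = {"A": "AGCGA", "B": "AGCTA", "C": "CTAGA", "D": "CTATA"}

# The three possible 4-taxon trees, keyed by the group each one contains.
trees = {
    frozenset({"A", "B"}): (("A", "B"), ("C", "D")),
    frozenset({"A", "C"}): (("A", "C"), ("B", "D")),
    frozenset({"A", "D"}): (("A", "D"), ("B", "C")),
}
lengths = {group: fitch_length(tree, alignment) for group, tree in trees.items()}
print(bremer_support(lengths, frozenset({"A", "B"})))  # 7 - 5 = 2
```

here the best tree (with A+B) takes 5 steps and the best tree without A+B takes 7, so the Bremer support for A+B is 2.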

Jackknife support value

sensitivity to which taxa are included: take out one species and redo the analysis to see how similar the result is. a test of sensitivity to taxon sampling: how well have I sampled species to represent the group I'm interested in?

originally used in parsimony

taxon sampling

take out one species and rebuild the tree; how well does it match the original phylogeny?

number on each node represents the percentage of replicates in which that relationship is found
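
A rough sketch of the delete-one-taxon idea described above. The closest_pair function is only a toy stand-in for a real tree search (it just groups the two most similar sequences), the sequences are invented, and skipping replicates that delete a member of the group being scored is an assumption made here, not part of the notes.

```python
from itertools import combinations


def closest_pair(alignment):
    """Toy stand-in for a tree search: the two most similar taxa form a group."""
    def distance(a, b):
        return sum(x != y for x, y in zip(alignment[a], alignment[b]))

    pair = min(combinations(sorted(alignment), 2), key=lambda p: distance(*p))
    return frozenset(pair)


def taxon_jackknife(alignment, group, find_group=closest_pair):
    """Percent of delete-one-taxon replicates that still recover `group`."""
    recovered = 0
    replicates = 0
    for removed in alignment:
        if removed in group:
            continue  # assumption: only delete taxa outside the scored group
        reduced = {t: s for t, s in alignment.items() if t != removed}
        replicates += 1
        recovered += find_group(reduced) == group
    return 100.0 * recovered / replicates


# Hypothetical toy sequences: A and B differ at one site only.
alignment = {"A": "AAAAAA", "B": "AAAAAT", "C": "CCGGTT", "D": "CCGGAA"}
print(taxon_jackknife(alignment, frozenset({"A", "B"})))
```

a value near 100 means the group does not depend on any single sampled species; a low value means the result is sensitive to taxon sampling.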

Total evidence

combine data sets one and two, treat them as one huge data set, and analyze it to get a single final result

Consensus

each data set is turned into a tree individually, then the trees are compared to each other and summarized as a consensus tree
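
A small sketch contrasting the two approaches, under some assumptions: alignments are stored as taxon-to-sequence dictionaries, trees are summarized as sets of groups, and "consensus" is taken to mean a strict consensus (keep only groups found in every tree). The function names are hypothetical.

```python
def concatenate(*alignments):
    """Total evidence: join every data set into one long alignment per taxon."""
    taxa = set(alignments[0])
    for aln in alignments:
        if set(aln) != taxa:
            # assumption: every data set covers the same taxa
            raise ValueError("all data sets must cover the same taxa")
    return {t: "".join(aln[t] for aln in alignments) for t in sorted(taxa)}


def strict_consensus(*clade_sets):
    """Consensus: keep only the groups that appear in every individual tree."""
    return set.intersection(*map(set, clade_sets))


# Total evidence: one combined data set, analyzed once.
gene1 = {"A": "AC", "B": "GT"}
gene2 = {"A": "TT", "B": "CC"}
combined = concatenate(gene1, gene2)

# Consensus: one tree per data set, then compare the groups they contain.
tree1 = {frozenset({"A", "B"}), frozenset({"A", "B", "C"})}
tree2 = {frozenset({"A", "B"}), frozenset({"A", "B", "D"})}
shared = strict_consensus(tree1, tree2)
```

in this toy case the two trees only agree on the A+B group, so that is all the strict consensus keeps.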

Do the histories of different data sets always agree?

sometimes, but not always. just because a data set results in a saturated, complex answer doesn't mean it disagrees; rather, it may not be useful.

Sampling error

might be missing large chunks of data, so we won't be able to see some connections

"Goldilocks zone"

preferred kind of data: combined data can include some more variable genes and some conserved ones; we want the Goldilocks zone in between

don't want it to be completely conserved, nor so different that it reaches the saturation point

Bootstrap support value

start with the original data set, then resample from it to create a new data set; each data point can be sampled more than once

sampling with replacement

way to test how much support a relationship has

how consistently the data support the result across the entire data set

start with an aligned, homologous data set (the alignment), then resample it to see how similar the result is to the original

1. randomly resample characters with replacement to make a new data set the same size as the original

2. find best topology using new data set

3. repeat (replicates)

end of process: summarize with a number on each node showing how often that node was found in the pseudoreplicates, 100 being the highest

bootstrap support values represent the percentage of replicates that recovered each group.
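
The three steps above can be sketched roughly as follows. This assumes the alignment is a taxon-to-sequence dictionary, and it uses closest_pair_clade as a toy stand-in for "find best topology" (it only groups the two most similar sequences); a real analysis would plug in a parsimony, likelihood, or distance tree search there. All names and sequences are made up for illustration.

```python
import random
from itertools import combinations


def resample_columns(alignment, rng):
    """Step 1: draw characters (columns) with replacement, same size as original."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {t: "".join(seq[i] for i in picks) for t, seq in alignment.items()}


def closest_pair_clade(alignment):
    """Toy stand-in for step 2: the two most similar taxa form the only group."""
    def distance(a, b):
        return sum(x != y for x, y in zip(alignment[a], alignment[b]))

    pair = min(combinations(sorted(alignment), 2), key=lambda p: distance(*p))
    return [frozenset(pair)]


def bootstrap_support(alignment, find_clades, n_replicates=100, seed=0):
    """Step 3 and summary: percent of pseudoreplicates recovering each group."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_replicates):
        pseudo = resample_columns(alignment, rng)
        for clade in find_clades(pseudo):
            counts[clade] = counts.get(clade, 0) + 1
    return {clade: 100.0 * n / n_replicates for clade, n in counts.items()}


# Hypothetical toy alignment: A and B are identical, C and D are very different.
alignment = {"A": "AAAAAAAA", "B": "AAAAAAAA", "C": "CCCCCCCC", "D": "GGGGGGGG"}
support = bootstrap_support(alignment, closest_pair_clade, n_replicates=20)
```

with this toy data every pseudoreplicate groups A with B, so that group gets a value of 100; noisier data would give lower numbers.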

can compare one analysis to another because -

Why is the bootstrap support value most widely used?

can be used with any analysis of data, widely applicable to all methodologies

Strengths

relatively easy, straightforward

no new data

Weaknesses / Drawbacks

estimates precision, not accuracy

tends to overestimate confidence

assumptions: independence

computational time

Bayesian

when a Bayesian analysis is done, part of the process gives us support values (posterior probabilities)

can be confident in results when they agree with parsimony and maximum likelihood analyses

when unsure, improve taxon sampling or increase the data set!

Shimodaira-Hasegawa (SH) Test

used in a statistical framework (maximum likelihood / Bayesian)

is a certain tree statistically different from (significantly worse than) the best tree? if so, don't need to worry about that tree

Sister taxon

the closest relative (nearest outgroup) of the species or group you are studying