User’s Guide

ivy is designed to be used both as a library and interactively, in the IPython shell. IPython is an enhanced Python interpreter designed for interactive and exploratory scientific computing.

This screencast demonstrates some basic concepts:

Watch in full resolution (opens in new window)

Quickstart

Starting the shell

The -pylab option starts IPython in a mode that imports a number of functions from matplotlib and numpy, and allows interactive plotting.

$ ipython -pylab

In interactive mode, ivy provides some useful shortcut functions (e.g., readtree, readaln, treefig, alnfig) that you will typically want to import as follows.

In [1]: from ivy.interactive import *

Viewing a tree

Assuming you have started the shell,

In [2]: s = "examples/primates.newick"
In [3]: fig = treefig(s)

Here, the variable s can be a newick string, the name (path) of a file containing a newick tree, or an open file containing a newick string. Note that file paths are completed dynamically in ipython by hitting the <TAB> key, making it easy to find files with little typing.

A new window should appear, controlled by the variable fig. View the help for fig:

fig?

Trees in Ivy

Ivy does not have a tree class per se; rather trees in Ivy exist as collections of nodes. In Ivy, a Node is a class that contains information about a node. Nodes are rooted and recursively contain their children. Functions in Ivy act directly on Node objects. Nodes support Python idioms such as in, [, iteration, etc. This guide will cover how to read, view, navigate, modify, and write trees in Ivy.

Reading

You can read in trees using Ivy’s tree.read function. This function supports newick and nexus files. The tree.read function can take a file name, a file object, or a Newick string as input. The output of this function is the root node of the tree.

In [1]: import ivy
In [2]: f = open("examples/primates.newick")
In [3]: r = ivy.tree.read(f)
In [4]: f.close()
In [5]: r = ivy.tree.read("examples/primates.newick")
In [6]: r = ivy.tree.read(
            "((((Homo:0.21,Pongo:0.21)A:0.28,Macaca:0.49)"
            "B:0.13,Ateles:0.62)C:0.38,Galago:1.00)root;")
In [7]: # These three methods are identical

You can copy a read tree using the copy method on the root node. Node objects are mutable, so this method is preferred over r2 = r if you want to create a deep copy.

In [8]: r2 = r.copy(recurse=True) # If recurse=False, won't copy children etc.

Warning

As of now, the copy function does not produce a complete tree: the nodes are not properly connected to each other

In [9]: print r2["A"].parent
None

Viewing

There are a number of ways you can view trees in Ivy. For a simple display without needing to create a plot, Ivy can create ascii trees that can be printed to the console.

In [10]: print r.ascii # You can use the ascii method on root nodes.
                               ---------+ Homo
                      --------A+
             --------B+        ---------+ Pongo
             :        :
    --------C+        ------------------+ Macaca
    :        :
root+        ---------------------------+ Ateles
    :
    ------------------------------------+ Galago

For a more detailed and interactive tree, Ivy can create a plot using Matplotlib. More detail about visualization using Matplotlib will follow later in the guide.

In [11]: import ivy.vis
In [12]: fig = ivy.vis.tree.TreeFigure(r)
In [13]: fig.show()
_images/primate_mpl.png

You can also create a plot using Bokeh.

In [14]: import ivy.vis.bokehtree
In [15]: fig2 = ivy.vis.bokehtree.BokehTree(r)
In [16]: fig2.drawtree()
_images/primate_bokeh.png

Testing

We can test many attributes of a node in Ivy.

We can test whether a node contains another node

In [34]: r["A"] in r["C"]
Out[34]: True
In [35]: r["C"] in r["A"]
Out[35]: False
In [36]: r["C"] in r["C"]
Out[36]: True # Nodes contain themselves

We can test if a node is the root

In [37]: r.isroot
Out[37]: True
In [38]: r["C"].isroot
Out[38]: False

We can test if a node is a leaf

In [39]: r.isleaf
Out[39]: False
In [40]: r["Homo"].isleaf
Out[40]: True

We can test if a group of leaves is monophyletic

In [41]: r.ismono(r["Homo"], r["Pongo"])
Out[41]: True
In [42]: r.ismono(r["Homo"], r["Pongo"], r["Galago"])
Out[42]: False

Warning

ismono does not return an error if an internal node is given to it, but it does produce undesired results.

Modifying

The Ivy Node object has many methods for modifying a tree in place.

Removing

There are two main ways to remove nodes in Ivy; collapsing and pruning.

Collapsing removes a node and attaches its descendants to the node’s parent.

In [43]: r["A"].collapse()
In [44]: print r.ascii()
                            ------------+ Macaca
                            :
                -----------B+-----------+ Homo
                :           :
    -----------C+           ------------+ Pongo
    :           :
root+           ------------------------+ Ateles
    :
    ------------------------------------+ Galago

Pruning removes a node and its descendants

In [45]: cladeB = r["B"] # Store this node: we will add it back later
In [46]: r["B"].prune()
In [47]: print r.ascii()
    -----------------C+-----------------+ Ateles
root+
    ------------------------------------+ Galago

You can see that the tree now has a ‘knee’: clade C only has one child and does not need to exist on the tree. We can remove it with another method of removing nodes: excising. Excising removes a node from between its parent and its single child.

In [48]: r["C"].excise()
In [49]: print r.ascii()
    -------------------------------------+ Galago
root+
    -------------------------------------+ Ateles

It is important to note that although the tree has changed, the nodes in the tree retain some of their original attributes, including their indices:

In [50]: r[0]
Out[50]: Node(140144821291920, root, 'root')
In [51]: r[1] # Node 1 ("C") no longer exists
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)

IndexError: 1
In [52]: r[7] # You can access existing nodes with their original indices
Out[52]: Node(140144821292368, leaf, 'Ateles')

To recap: #. collapse removes a node and adds its descendants to its parent #. prune removes a node and also removes its descendants #. excise removes ‘knees’

Adding

Our tree is looking a little sparse, so let’s add some nodes back in. There are a few methods of adding nodes in Ivy. We will go over biscect, add_child, and graft

Bisecting creates a ‘knee’ node halfway between a parent and a child.

In [53]: r["Galago"].bisect_branch()
Out[53]: Node(140144821654480)
In [54]: print r.ascii
    ------------------------------------+ Ateles
root+
    ------------------+-----------------+ Galago

We now have a brand new node. We can set some of its attributes, including its label.

Note: we cannot access this new node by using node indicies (that is, r[1], etc.). We also cannot use its label because it has none. We’ll access it using its ID instead (if you’re following along, your ID will be different).

In [55]: r[140144821654480].label = "N"

Now let’s add a node as a child of N. We can do this using the add_child method.

In [56]: r["N"].add_child(cladeB["Homo"])
In [57]: print r.ascii()
    ------------------------------------+ Ateles
root+
    :                 ------------------+ Galago
    -----------------N+
                      ------------------+ Homo

We can also add nodes with graft. graft adds a node as a sibling to the current node. In doing so, it also adds a new node as parent to both nodes.

In [58]: r["Ateles"].graft(cladeB["Macaca"])
In [59]: r["Galago"].graft(cladeB["Pongo"])
In [60]: print r.ascii()
                ------------------------+ Homo
    -----------N+
    :           :           ------------+ Galago
    :           ------------+
root+                       ------------+ Pongo
    :
    :                       ------------+ Ateles
    ------------------------+
                            ------------+ Macaca

To recap:

  1. bisect_branch adds ‘knees’
  2. add_child adds a node as a child to the current node
  3. graft adds a node as a sister to the current node, and also adds a parent.

Ladderizing

Ladderizing non-destructively changes the tree so that it has a nicer-looking output when drawn. It orders the clades by size.

In [61]: r.ladderize()
In [62]: print r.ascii()
                            ------------+ Ateles
    ------------------------+
    :                       ------------+ Macaca
root+
    :           ------------------------+ Homo
    -----------N+
                :           ------------+ Galago
                ------------+
                            ------------+ Pongo

Rerooting

Warning

Currently does not work properly.

In [63]: r.reroot(r["N"])
In [64]: r.descendants() # Missing descendants
Out[64]:
[Node(140144821839696),
 Node(140144821839120, leaf, 'Ateles'),
 Node(140144821839056, leaf, 'Macaca')]
In [65]: print r.ascii() # Raises a KeyError

Writing

Once you are done modifying your tree, you will probably want to save it. You can save your trees with the write function. This function takes a root node and an open file object as inputs. This function can currently only write in newick format.

In [66]: f = open("examples/primates_mangled.newick", "w")
In [67]: ivy.tree.write(r, outfile = f)
In [68]: f.close()

Using Treebase

ivy has functions to pull trees from Treebase.

Fetching the study

If you have an id for a study on treebase, you can fetch the study and access the trees contained within the study.

In [69]: import ivy
In [70]: from ivy.treebase import fetch_study
In [71]: study_id = "1411" # The leafy cactus genus Pereskia
In [72]: e = fetch_study(study_id, 'nexml') # e is an lxml etree

Accessing the tree

You can parse the output of fetch_study using the parse_nexml function,
then access the tree(s) contained within the study.
In [73]: from ivy.treebase import parse_nexml
In [74]: x = parse_nexml(e) # x is an ivy Storage object
In [75]: r = x.trees[0].root
In [76]: from ivy.interactive import treefig
In [77]: fig = treefig(r)

Visualization with Matplotlib

ivy supports interactive tree visualization with Matplotlib.

Displaying a tree is very simple

In [78]: from ivy.interactive import *
In [79]: r = ivy.tree.read("examples/primates.newick")
In [80]: fig = treefig(r)
_images/visualization_1.png

A tree figure by default consists of the tree with clade and leaf labels and a navigation toolbar. The navigation toolbar allows zooming and panning. Panning can be done by clicking with the middle mouse button, using the arrow keys, or using the pan tool on the toolbar. Zooming can be done using the scroll wheel, the plus and minus keys, or the ‘zoom to rectangle’ tool in the toolbar. Press t to return default zoom level.

Larger trees are shown with a split overview pane as well, which can be toggled with the toggle_overview method.

In [81]: fig.toggle_overview()
_images/visualization_2.png

You can retrieve information about a node or group of nodes by selecting them (selected nodes have green circles on them) and accessing the selected nodes

In [82]: fig.selected
Out [83]:
[Node(139976891981456, leaf, 'Homo'),
 Node(139976891981392, 'A'),
 Node(139976891981520, leaf, 'Pongo')]
_images/visualization_3.png

You can also select nodes from the command line. Entering an internal node will select that node and all of its descendants.

In [84]: fig.select_nodes(r["C"])
_images/visualization_4.png

You can highlight certain branches using the highlight method. Again, entering an internal node will highlight that node and its descendants. This also highlights the branches on the overview.

In [85]: fig.highlight(r["B"])
_images/visualization_5.png

You can also decorate the tree with various symbols using the decorate method. decorate can be called with any function from ivy.symbols.

In [86]: import ivy.vis.symbols
In [87]: fig.redraw() # This clears the plot
In [88]: fig.decorate(ivy.vis.symbols.circles, r.leaves(),
        colors = ["red", "orange", "yellow", "green", "blue"])
_images/visualization_6.png

Performing analyses

ivy has many tools for performing analyses on trees. Here we will cover a few analyses you can perform.

Phylogenetically Independent Contrasts

You can perform PICs using ivy’s PIC function. This function takes a root node and a dictionary mapping node labels to character traits as inputs and outputs a dictionary mappinginternal nodes to tuples containing ancestral state, its variance (error), the contrast, and the contrasts’s variance.

Note: This function requires that the root node have a length that is not none. Note: this function currently cannot handle polytomies.

In [89]: import ivy
In [90]: r = ivy.tree.read("examples/primates.newick")
In [91]: r.length = 0.0 # Setting the root length to 0
In [92]: char1 = {
                "Homo": 4.09434,
                "Pongo": 3.61092,
                "Macaca": 2.37024,
                "Ateles": 2.02815,
                "Galago": -1.46968
                }
In [93]: c = ivy.contrasts.PIC(r, char1)
In [94]: for k,v in c.items():
            print k.label, v
root (1.1837246133953971, 0.3757434703904836, 4.25050357912179, 1.6019055509527755)
A (3.85263, 0.385, 0.48341999999999974, 0.42)
B (3.2003784000000004, 0.3456, 1.48239, 0.875)
C (2.78082357912179, 0.6019055509527755, 1.1722284000000003, 0.9656)

Lineages Through Time

ivy has functions for computing LTTs. The ltt function takes a root node as input and returns a tuple of 1D-arrays containing the results of times and diverisities.

Note: The tree is expected to be an ultrametric chromogram (extant leaves, branch lengths proportional to time).

In [95]: import ivy
In [96]: r = ivy.tree.read("examples/primates.newick")
In [97]: v = ivy.ltt(r)
In [98]: print r.ascii()
                               ---------+ Homo
                      --------A+
             --------B+        ---------+ Pongo
             :        :
    --------C+        ------------------+ Macaca
    :        :
root+        ---------------------------+ Ateles
    :
    ------------------------------------+ Galago
In [99]: for i in l:
            print i
[ 0.    0.38  0.51  0.79]
[ 1.  2.  3.  4.]

You can plot your results using Matplotlib.

In [100]: import matplotlib.pyplot as plt
In [101]: plt.step(v[0], v[1])
In [102]: plt.margins(.2, .2)
In [103]: plt.xlabel("Time"); plt.ylabel("Lineages"); plt.title("LTT")
In [104]: plt.show()
_images/ltt.png

Phylorate plot

By accessing R libraries using rpy2, we can use the functions in the BAMMtools R library to generate phylorate plots.

The following analysis is done using the whales dataset provided with BAMMtools.

The first step is to read in the data and then import and use the necessary R functions to get the rate data for each branch.

In [105]: from rpy2.robjects.packages import importr
In [106]: import numpy as np
In [107]: from ivy.interactive import *
In [108]: e = "whaleEvents.csv" # Event data created with BAMM
In [109]: treefile = "whales.tre"
In [110]: ape = importr('ape')
In [111]: bamm = importr('BAMMtools')
In [112]: rutils = importr('utils')
In [113]: events = rutils.read_csv(e)
In [114]: tree = ape.read_tree(treefile)
In [115]: edata = bamm.getEventData(tree, eventdata=e, burnin=0.2)
In [116]: dtrates = bamm.dtRates(edata, 0.01, tmat=True).rx2('dtrates')
In [117]: nodeidx = np.array(dtrates.rx2('tmat').rx(True, 1), dtype=int)
In [118]: rates = np.array(dtrates.rx2('rates'))
In [119]: netdiv = rates[0]-rates[1]

Now we are done using R functions. The rest can be done in Python.

The next step is to read in the tree with ivy and then assign the Ape node indicies. Ape numbers nodes as following: for a tree with n leaves, the leaves and numbered 1:n in the order they appear in their file. The internal nodes are ordered in preorder sequence, starting with the root node as node n+1.

In [120]: r = ivy.tree.read(treefile, type="newick")
In [121]: i = 1
In [122]: for lf in r.leaves():
        lf.apeidx = i
        i += 1
In [123]: for n in r.clades():
        n.apeidx = i
        i += 1
In [124]: f = treefig(r)

Now we can generate the plot by drawing individual segments of each branch, color-coded by rate along the branch.

In [125]: for n in r.descendants():

n.rates = netdiv[nodeidx==n.apeidx] c = f.detail.n2c[n] pc = f.detail.n2c[n.parent] seglen = (c.x-pc.x)/len(n.rates) for i, rate in enumerate(n.rates):

x0 = pc.x+i*seglen x1 = x0+seglen segments.append(((x0, c.y), (x1, c.y))) values.append(rate)

segments.append(((pc.x, pc.y), (pc.x, c.y))) values.append(n.rates[0])

In [126]: from matplotlib.cm import coolwarm In [127]: from matplotlib.collections import LineCollection In [128]: lc = LineCollection(segments, cmap=coolwarm, lw=2) In [129]: lc.set_array(np.array(values)) In [130]: f.detail.add_collection(lc) In [131]: f.figure.canvas.draw_idle()

_images/phylorate_plot.png