{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "# Comparison Between TreeValue and DM-Tree" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "In this section, we compare the features and performance of TreeValue with those of the [dm-tree](https://github.com/deepmind/tree) library, which is developed by DeepMind." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "Before starting the comparison, let us define the nested dictionary that all of the examples below use." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:03:14.453962Z", "iopub.status.busy": "2024-10-16T14:03:14.453752Z", "iopub.status.idle": "2024-10-16T14:03:14.461586Z", "shell.execute_reply": "2024-10-16T14:03:14.460941Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "origin = {'a': 1, 'b': 2, 'c': {'x': 3, 'y': 4}}" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "## Mapping Operation" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "Mapping is a common operation when processing trees: a function is applied to every leaf value, and the results form a new tree with the same structure as the original." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### TreeValue's mapping" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "In TreeValue, the `mapping` function creates such a new tree."
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:03:14.464169Z", "iopub.status.busy": "2024-10-16T14:03:14.463792Z", "iopub.status.idle": "2024-10-16T14:03:14.505116Z", "shell.execute_reply": "2024-10-16T14:03:14.504442Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "\n", "├── 'a' --> 1\n", "├── 'b' --> 2\n", "└── 'c' --> \n", " ├── 'x' --> 3\n", " └── 'y' --> 4" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from treevalue import FastTreeValue, mapping\n", "\n", "tv = FastTreeValue(origin)\n", "tv" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:03:14.541147Z", "iopub.status.busy": "2024-10-16T14:03:14.540559Z", "iopub.status.idle": "2024-10-16T14:03:14.545938Z", "shell.execute_reply": "2024-10-16T14:03:14.545359Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "\n", "├── 'a' --> 2\n", "├── 'b' --> 4\n", "└── 'c' --> \n", " ├── 'x' --> 6\n", " └── 'y' --> 8" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mapping(tv, lambda x: x * 2)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "Here is the performance test." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:03:14.548215Z", "iopub.status.busy": "2024-10-16T14:03:14.547756Z", "iopub.status.idle": "2024-10-16T14:03:28.324978Z", "shell.execute_reply": "2024-10-16T14:03:28.324333Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1.7 µs ± 5.28 ns per loop (mean ± std. dev. 
of 7 runs, 1,000,000 loops each)\n" ] } ], "source": [ "%timeit mapping(tv, lambda x: x * 2)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "To support the case where the mapped value depends on both the path and the value of each node, we can use the 'path mapping mode' by giving the mapping function a second parameter, which receives the path." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:03:28.327372Z", "iopub.status.busy": "2024-10-16T14:03:28.326934Z", "iopub.status.idle": "2024-10-16T14:03:28.332097Z", "shell.execute_reply": "2024-10-16T14:03:28.331542Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "\n", "├── 'a' --> ('path', ('a',), 'value', 1)\n", "├── 'b' --> ('path', ('b',), 'value', 2)\n", "└── 'c' --> \n", " ├── 'x' --> ('path', ('c', 'x'), 'value', 3)\n", " └── 'y' --> ('path', ('c', 'y'), 'value', 4)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mapping(tv, lambda x, p: ('path', p, 'value', x))" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "And here is the performance." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:03:28.334323Z", "iopub.status.busy": "2024-10-16T14:03:28.333839Z", "iopub.status.idle": "2024-10-16T14:03:42.314817Z", "shell.execute_reply": "2024-10-16T14:03:42.314053Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1.72 µs ± 8.01 ns per loop (mean ± std. dev. 
of 7 runs, 1,000,000 loops each)\n" ] } ], "source": [ "%timeit mapping(tv, lambda x, p: ('path', p, 'value', x))" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### DM-Tree's mapping" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "In DM-Tree, the mapping operation is supported by the [map_structure](https://tree.readthedocs.io/en/latest/api.html#tree.map_structure) function." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:03:42.317314Z", "iopub.status.busy": "2024-10-16T14:03:42.316906Z", "iopub.status.idle": "2024-10-16T14:03:42.322182Z", "shell.execute_reply": "2024-10-16T14:03:42.321516Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "from tree import map_structure" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:03:42.324316Z", "iopub.status.busy": "2024-10-16T14:03:42.323945Z", "iopub.status.idle": "2024-10-16T14:03:42.328756Z", "shell.execute_reply": "2024-10-16T14:03:42.328236Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "{'a': 2, 'b': 4, 'c': {'x': 6, 'y': 8}}" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "map_structure(lambda x: x * 2, origin)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "This is the performance of `map_structure`, which in this run is several times slower than `mapping` in TreeValue."
] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:03:42.330911Z", "iopub.status.busy": "2024-10-16T14:03:42.330529Z", "iopub.status.idle": "2024-10-16T14:03:52.418796Z", "shell.execute_reply": "2024-10-16T14:03:52.418148Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "12.4 µs ± 78.1 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)\n" ] } ], "source": [ "%timeit map_structure(lambda x: x * 2, origin)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "To support the path-aware case shown in the previous section, [map_structure_with_path](https://tree.readthedocs.io/en/latest/api.html#tree.map_structure_with_path) can be used." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:03:52.421325Z", "iopub.status.busy": "2024-10-16T14:03:52.420726Z", "iopub.status.idle": "2024-10-16T14:03:52.424108Z", "shell.execute_reply": "2024-10-16T14:03:52.423457Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "from tree import map_structure_with_path" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:03:52.426516Z", "iopub.status.busy": "2024-10-16T14:03:52.426011Z", "iopub.status.idle": "2024-10-16T14:03:52.432436Z", "shell.execute_reply": "2024-10-16T14:03:52.431861Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "{'a': ('path', ('a',), 'value', 1),\n", " 'b': ('path', ('b',), 'value', 2),\n", " 'c': {'x': ('path', ('c', 'x'), 'value', 3),\n", " 'y': ('path', ('c', 'y'), 'value', 4)}}" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "map_structure_with_path(lambda path, x: ('path', path, 'value', x), origin)" ] }, { "cell_type": "markdown", "metadata": { 
"collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "Here is the performance." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:03:52.434639Z", "iopub.status.busy": "2024-10-16T14:03:52.434234Z", "iopub.status.idle": "2024-10-16T14:03:54.675235Z", "shell.execute_reply": "2024-10-16T14:03:54.674605Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "27.6 µs ± 124 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)\n" ] } ], "source": [ "%timeit map_structure_with_path(lambda path, x: ('path', path, 'value', x), origin)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "## Flatten and Unflatten" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "In tree operations, flatten is often used to linearly expand the tree structure for operations such as parallel processing. Based on flatten, unflatten is its inverse operation, which can recover the tree structure from the linear data." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### TreeValue's Performance" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "In TreeValue, flatten and unflatten are provided, which usage are simple." 
] }, { "cell_type": "code", "execution_count": 13, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:03:54.677914Z", "iopub.status.busy": "2024-10-16T14:03:54.677420Z", "iopub.status.idle": "2024-10-16T14:03:54.682559Z", "shell.execute_reply": "2024-10-16T14:03:54.681961Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "\n", "├── 'a' --> 1\n", "├── 'b' --> 2\n", "└── 'c' --> \n", " ├── 'x' --> 3\n", " └── 'y' --> 4" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from treevalue import FastTreeValue, flatten, unflatten\n", "\n", "origin_tree = FastTreeValue(origin)\n", "origin_tree" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:03:54.684773Z", "iopub.status.busy": "2024-10-16T14:03:54.684395Z", "iopub.status.idle": "2024-10-16T14:03:54.688969Z", "shell.execute_reply": "2024-10-16T14:03:54.688434Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "[(('a',), 1), (('b',), 2), (('c', 'x'), 3), (('c', 'y'), 4)]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "flatted = flatten(origin_tree)\n", "flatted" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "Here is the performance of `flatten`" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:03:54.691078Z", "iopub.status.busy": "2024-10-16T14:03:54.690867Z", "iopub.status.idle": "2024-10-16T14:03:58.837887Z", "shell.execute_reply": "2024-10-16T14:03:58.837160Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "511 ns ± 2.29 ns per loop (mean ± std. dev. 
of 7 runs, 1,000,000 loops each)\n" ] } ], "source": [ "%timeit flatten(origin_tree)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "The tree can be re-created from `flatted` with function `unflatten`." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:03:58.840086Z", "iopub.status.busy": "2024-10-16T14:03:58.839873Z", "iopub.status.idle": "2024-10-16T14:03:58.844195Z", "shell.execute_reply": "2024-10-16T14:03:58.843663Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "\n", "├── 'a' --> 1\n", "├── 'b' --> 2\n", "└── 'c' --> \n", " ├── 'x' --> 3\n", " └── 'y' --> 4" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "unflatten(flatted)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "And here is the performance." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:03:58.846487Z", "iopub.status.busy": "2024-10-16T14:03:58.845882Z", "iopub.status.idle": "2024-10-16T14:04:03.640142Z", "shell.execute_reply": "2024-10-16T14:04:03.639364Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "590 ns ± 0.628 ns per loop (mean ± std. dev. 
of 7 runs, 1,000,000 loops each)\n" ] } ], "source": [ "%timeit unflatten(flatted)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### DM-Tree's Performance" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "[Flatten](https://tree.readthedocs.io/en/latest/api.html#tree.flatten) is provided in DM-Tree as well, but it behaves differently from the one in TreeValue, as shown below." ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:04:03.642697Z", "iopub.status.busy": "2024-10-16T14:04:03.642260Z", "iopub.status.idle": "2024-10-16T14:04:03.646975Z", "shell.execute_reply": "2024-10-16T14:04:03.646406Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "[1, 2, 3, 4]" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from tree import flatten\n", "\n", "flatten(origin)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "Here is the performance." ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:04:03.649193Z", "iopub.status.busy": "2024-10-16T14:04:03.648785Z", "iopub.status.idle": "2024-10-16T14:04:09.692964Z", "shell.execute_reply": "2024-10-16T14:04:09.692215Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "745 ns ± 2.51 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)\n" ] } ], "source": [ "%timeit flatten(origin)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "The structure of the tree is dropped, and only the data is extracted from the tree. 
This means the `flatten` function in DM-Tree is irreversible: we cannot recover the original tree from the result above alone." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "The reversible alternative provided in DM-Tree is [flatten_with_path](https://tree.readthedocs.io/en/latest/api.html#tree.flatten_with_path)." ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:04:09.695584Z", "iopub.status.busy": "2024-10-16T14:04:09.695175Z", "iopub.status.idle": "2024-10-16T14:04:09.699941Z", "shell.execute_reply": "2024-10-16T14:04:09.699274Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "[(('a',), 1), (('b',), 2), (('c', 'x'), 3), (('c', 'y'), 4)]" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from tree import flatten_with_path\n", "\n", "flatten_with_path(origin)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "Here is the performance, which in this run is much slower than `flatten` in TreeValue." ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:04:09.702280Z", "iopub.status.busy": "2024-10-16T14:04:09.701893Z", "iopub.status.idle": "2024-10-16T14:04:20.375389Z", "shell.execute_reply": "2024-10-16T14:04:20.374779Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "13.2 µs ± 153 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)\n" ] } ], "source": [ "%timeit flatten_with_path(origin)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "To re-create the original tree, we need a template tree with the same structure, filled with arbitrary placeholder objects. 
Use [unflatten_as](https://tree.readthedocs.io/en/latest/api.html#tree.unflatten_as) to achieve this." ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:04:20.377665Z", "iopub.status.busy": "2024-10-16T14:04:20.377348Z", "iopub.status.idle": "2024-10-16T14:04:20.382352Z", "shell.execute_reply": "2024-10-16T14:04:20.381799Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "{'a': 1, 'b': 2, 'c': {'x': 3, 'y': 4}}" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from tree import unflatten_as\n", "\n", "unflatten_as({'a': None, 'b': None, 'c': {'x': None, 'y': None}}, [1, 2, 3, 4])" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "Here is the performance." ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:04:20.384461Z", "iopub.status.busy": "2024-10-16T14:04:20.384063Z", "iopub.status.idle": "2024-10-16T14:04:28.649960Z", "shell.execute_reply": "2024-10-16T14:04:28.649227Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "10.2 µs ± 16.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)\n" ] } ], "source": [ "%timeit unflatten_as({'a': None, 'b': None, 'c': {'x': None, 'y': None}}, [1, 2, 3, 4])" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Positional Replacement" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "The `unflatten_as` in DM-Tree is quite different from `unflatten` in TreeValue: the former replaces the values in an existing structure, while the latter constructs a new tree from path-value pairs. 
This means the `unflatten_as` in DM-Tree may support some additional use cases, such as creating a new tree from another tree's structure and a given list of values. We therefore run an experiment on this operation, which we call 'positional replacement'." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "First, in TreeValue, we can build a function named `replace` to implement this." ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:04:28.652362Z", "iopub.status.busy": "2024-10-16T14:04:28.651955Z", "iopub.status.idle": "2024-10-16T14:04:28.655708Z", "shell.execute_reply": "2024-10-16T14:04:28.655132Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "from treevalue import flatten, unflatten\n", "\n", "def replace(t, v):\n", "    pairs = flatten(t)\n", "    return unflatten([(path, vi) for (path, _), vi in zip(pairs, v)])" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "Create a new tree based on `origin_tree`'s structure and new values." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:04:28.657700Z", "iopub.status.busy": "2024-10-16T14:04:28.657338Z", "iopub.status.idle": "2024-10-16T14:04:28.661861Z", "shell.execute_reply": "2024-10-16T14:04:28.661222Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "\n", "├── 'a' --> 3\n", "├── 'b' --> 5\n", "└── 'c' --> \n", " ├── 'x' --> 7\n", " └── 'y' --> 9" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "replace(origin_tree, [3, 5, 7, 9])" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "Here is the performance."
] }, { "cell_type": "code", "execution_count": 26, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:04:28.664108Z", "iopub.status.busy": "2024-10-16T14:04:28.663741Z", "iopub.status.idle": "2024-10-16T14:04:43.702358Z", "shell.execute_reply": "2024-10-16T14:04:43.701697Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1.85 µs ± 3.24 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)\n" ] } ], "source": [ "%timeit replace(origin_tree, [3, 5, 7, 9])" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "In DM-Tree, `unflatten_as` can be used directly." ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:04:43.704685Z", "iopub.status.busy": "2024-10-16T14:04:43.704243Z", "iopub.status.idle": "2024-10-16T14:04:43.709187Z", "shell.execute_reply": "2024-10-16T14:04:43.708639Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "{'a': 3, 'b': 5, 'c': {'x': 7, 'y': 9}}" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from tree import unflatten_as\n", "\n", "unflatten_as(origin, [3, 5, 7, 9])" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "Here is the performance. In this run it is much slower than even the composite `replace` function built above." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "execution": { "iopub.execute_input": "2024-10-16T14:04:43.711194Z", "iopub.status.busy": "2024-10-16T14:04:43.710870Z", "iopub.status.idle": "2024-10-16T14:04:51.669748Z", "shell.execute_reply": "2024-10-16T14:04:51.669106Z" }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "9.81 µs ± 44.5 ns per loop (mean ± std. dev. 
of 7 runs, 100,000 loops each)\n" ] } ], "source": [ "%timeit unflatten_as(origin, [3, 5, 7, 9])" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "## Conclusion" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "The mapping operation is supported by both libraries, and in the measurements above `mapping` in TreeValue is significantly faster than `map_structure` and `map_structure_with_path` in DM-Tree." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "The `flatten` and `unflatten` functions in TreeValue are reversible, whereas the plain `flatten` in DM-Tree is not. In the measurements above, DM-Tree is also slower than TreeValue on every flatten and unflatten operation tested." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.18" } }, "nbformat": 4, "nbformat_minor": 0 }