Day 7

Here we're given some sort of 'tower' which really looks like a tree graph and we're supposed to identify the base of the tree. Reading in and formatting the input will be a bit tricky:

In [2]:
import pandas as pd
import numpy as np
import sys

initial = open('/Users/Sven/py_files/aoc_2017/d7_input.txt')
initial = initial.readlines()
initial[:5]
Out[2]:
['mmqyju (156) -> rjzvwv, noybkx\n',
 'dvkug (90) -> tbjbz, gxnvl\n',
 'meeiw (95)\n',
 'iaibv (52)\n',
 'ckddz (36)\n']

Okay that seems like we can work with it. Now I think I'll want to compile this into a DataFrame since that will be easier to work with. Maybe we can write a function to format the input into three columns and then return a data frame.

In [3]:
def fmt_block(txt):
    # break off stuff after an arrow
    txt = txt.split('->')
    if len(txt) == 2:
        dependencies = txt[1]
    else:
        dependencies = ''
    dependencies = dependencies[:-1]
    dependencies = dependencies.replace(', ', ' ')

    left = txt[0]
    # get the name
    name = left[:left.find(' ')]

    # Now get the weight:
    wt = left[(left.find('(') + 1):left.find(')')]
    wt = int(wt)
    dat = pd.DataFrame({'name' : [name], 'weight' : [wt], 'dependencies' : [dependencies]})
    return(dat)

print(initial[1])
fmt_block(initial[1])
dvkug (90) -> tbjbz, gxnvl

Out[3]:
name weight dependencies
0 dvkug 90 tbjbz gxnvl

Sweet, now we just need to run this across the whole thing and concatinate the data sets

In [4]:
tower = [fmt_block(x) for x in initial]
tower = pd.concat(tower).reset_index(drop = True)
tower['removed'] = 0
tower.head()
Out[4]:
name weight dependencies removed
0 mmqyju 156 rjzvwv noybkx 0
1 dvkug 90 tbjbz gxnvl 0
2 meeiw 95 0
3 iaibv 52 0
4 ckddz 36 0
In [5]:
tower_save = tower.copy()

Well this is going swimmingly! Now we're going to have to do some kind of iterative process on the data set where we look for terminal nodes, go through the list, and remove that node from any dependencies.

In [6]:
def rmv_node():
    global tower
    # Find the node to remove
    node_removing = tower.loc[(tower.dependencies == '') & (tower.removed == 0), 'name']
    node_removing = node_removing.index[0]
    tower.loc[node_removing, 'removed'] = 1
    node_removing = tower.name[node_removing]  
    # Format the regular expression to remove:
    node_removing = '(^| )' + node_removing + '($| )'
    
    # Replace it from any dependencies
    tower['dependencies'] = tower.dependencies.replace(node_removing, ' ', regex = True)
    tower['dependencies'] = tower.dependencies.str.strip()
    

Now we just keep doing this until they're all gone but one

In [7]:
i = 0
while (sum(tower.removed) + 1) < len(tower):
    rmv_node()
    i += 1
    if i % 100 == 0:
        print(i)
100
200
300
400
500
600
700
800
900
1000
1100
1200
In [8]:
tower.removed.value_counts()
Out[8]:
1    1287
0       1
Name: removed, dtype: int64
In [9]:
tower[tower.removed == 0]
Out[9]:
name weight dependencies removed
103 svugo 32 0

Part 2

Now we have to make a change to the program where we look for 'balance' among the subtrees' weights. It seems like the hardest part of this will be finding nodes where all their dependencies are terminal branches...

In [10]:
tower_save['branches'] = tower_save.dependencies.str.strip()
tower_save['branches'] = tower_save.branches.str.count(' ') + 1
tower_save['total_weight'] = tower_save['weight']
tower_save_again = tower_save.copy()
tower_save.loc[tower_save.dependencies == '', 'branches'] = 0

Seems like we don't have any single branch connections at the end. Kinda wonder if we can work ourselves from the bottom up instead of top down...

In [11]:
tower_save[tower_save.name == 'svugo']
Out[11]:
name weight dependencies removed branches total_weight
103 svugo 32 xolvnpy gjxqx gtzxxav njorjq qpiklvf 0 5 32

Let's write a function to find the sub nodes given a name.

In [12]:
def find_subnodes(nam):
    dependencies = tower_save.loc[tower_save.name == nam,'dependencies']
    dependencies = dependencies.str.split().tolist()[0]
    subnodes = tower_save[tower_save.name.isin(dependencies)]
    return(subnodes)
find_subnodes('svugo')
Out[12]:
name weight dependencies removed branches total_weight
137 xolvnpy 22946 dfbeg ajrrm niuzt 0 3 22946
158 gjxqx 14 yruivis rizjob qsfwl asckjlv sfqwrge bncdhrm 0 6 14
206 qpiklvf 11870 weohqnq oifwih udamz 0 3 11870
557 njorjq 31504 ppgcgz tdqqjpg zuhfs lyxqyl 0 4 31504
1092 gtzxxav 8936 sxppnu jddjjw biibrbq ttldnhl 0 4 8936

Now we want to try and write a function for checking if the nodes are balanced:

In [13]:
def is_balanced(nam):
    global tower_save
    # first get subnodes:
    subnodes = find_subnodes(nam)
    
    # Now if any are any that have branches themselves, start calling is_balanced:
    branches = subnodes.loc[subnodes.branches != 0]
    if len(branches) > 0:
        for i in range(len(branches)):
            is_balanced(branches.name.iloc[i])
    
    # repull the subnodes when we return
    subnodes = find_subnodes(nam)
    
    # check if the subnodes are actually different
    if len(subnodes.total_weight.unique()) != 1:
        print('unbalanced at', nam)
        display(subnodes)
        raise KeyboardInterrupt

    # Now that all the subnodes have been checked, we set the total_weight
    # equal to the sum of the subnote weights plus the node's weight
    if len(subnodes) > 0:
        tower_save.loc[tower_save.name == nam, 'total_weight'] += sum(subnodes.total_weight)
    
    
In [14]:
is_balanced('svugo')
unbalanced at yruivis
name weight dependencies removed branches total_weight
171 oxipms 2533 njfaf okyya 0 2 2671
299 sphbbz 1161 gioryg scetry sfjspc ivpbt vyufo valbm oprfjt 0 7 2680
975 ggpau 91 mavpqd vtthqum qghnwdf uobgk muarank cgldm 0 6 2671
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-14-54b1cfee719e> in <module>
----> 1 is_balanced('svugo')

<ipython-input-13-82d807e50de3> in is_balanced(nam)
      8     if len(branches) > 0:
      9         for i in range(len(branches)):
---> 10             is_balanced(branches.name.iloc[i])
     11 
     12     # repull the subnodes when we return

<ipython-input-13-82d807e50de3> in is_balanced(nam)
      8     if len(branches) > 0:
      9         for i in range(len(branches)):
---> 10             is_balanced(branches.name.iloc[i])
     11 
     12     # repull the subnodes when we return

<ipython-input-13-82d807e50de3> in is_balanced(nam)
     17         print('unbalanced at', nam)
     18         display(subnodes)
---> 19         raise KeyboardInterrupt
     20 
     21     # Now that all the subnodes have been checked, we set the total_weight

KeyboardInterrupt: 

So now to find the aswer, we just have to take the weight of sphbbz and subtract the difference between its total weight and the total weight of oxipms.

In [ ]:
1161 - (2680 - 2671)

I'm kind of pleased with myself as usually when I try to write these recursive functions they just fail miserably. R also has a habit of crashing if you open to many function execution environment but python seems to be okay with it.