How to solve Advent of Code 2022 - Day 7 with Python
If you missed any previous days, click here for all my content about that: Advent of Code, if you want to know why you should participate and try to code the solution for yourself, click here: Advent Of Code 2022 – 7 Reasons why you should participate. If you're here to see the solution of Day 7, continue reading ;)
GitHub Repository
https://github.com/GalaxyInfernoCodes/Advent_Of_Code_2022
I will upload all of my solutions there - in the form of Python (or Scala alternatives) notebooks. You have to use your own input for the "input.txt" since everyone gets their own and everyone needs to provide a different solution based on their input.
Day 7 Puzzle
On day 7 of Advent of Code, we had to build a tree of a filesystem that we get knowledge of through recorded terminal commands, such as this:
$ cd /
$ ls
dir a
14848514 b.txt
8504156 c.dat
dir d
$ cd a
$ ls
dir e
29116 f
2557 g
62596 h.lst
$ cd e
$ ls
584 i
$ cd ..
$ cd ..
$ cd d
$ ls
4060174 j
8033020 d.log
5626152 d.ext
7214296 k
Which would generate the following tree:
- / (dir)
- a (dir)
- e (dir)
- i (file, size=584)
- f (file, size=29116)
- g (file, size=2557)
- h.lst (file, size=62596)
- b.txt (file, size=14848514)
- c.dat (file, size=8504156)
- d (dir)
- j (file, size=4060174)
- d.log (file, size=8033020)
- d.ext (file, size=5626152)
- k (file, size=7214296)
Part 1
Here we then had to calculate the size of the folders to then add up the ones with a total size with less than 100,000. Now you could do this in a shortcut-way without reconstructing the complete tree. However, I chose to train my data structures and build that tree from top to bottom - or rather, from root to leaves.
Tree Nodes
First off, I created a class for the entries in the tree. This would either be a file or a directory, indicated by a boolean value and if it's a file, it has a size.
At the start the node does not have children or a parent. You can however add a child through the method **add_child**
which assigns the child a parent before adding it to the children
list.
Then I calculated the size recursively, always adding the size of the children to the current size until I arrive at a file (which by its nature does not have children), in which case I return just the file size.
I also included a print method for the tree. Frankly because debugging without being able to see the tree was nearly impossible.
class TreeNode:
def __init__(self, is_dir: bool, name, size=None) -> None:
self.is_dir = is_dir
self.name = name
self.size = size
self.children = []
self.parent = None
def add_child(self, child):
child.parent = self
self.children.append(child)
def get_size(self):
if self.is_dir:
total_size = 0
for child in self.children:
total_size += child.get_size()
return total_size
else:
return self.size
def print_children(self, level):
if self.is_dir:
print('--' * level + self.name + ' (total=' + str(self.get_size()) +')')
else:
print('--' * level + self.name + ' (file=' + str(self.get_size()) +')')
if len(self.children) > 0:
for child in self.children:
child.print_children(level+1)
Tree
I also wrote a class for the tree. You don't have to do that, most of the methods are one liners and its main purpose is keeping track of the current directory/node. You could easily do that in your main loop.
class Tree:
def __init__(self) -> None:
self.root = TreeNode(is_dir=True, name="root")
self.current = self.root
def reset_to_root(self):
self.current = self.root
def go_up_one_level(self):
self.current = self.current.parent
def go_to_child(self, name):
self.current = list(filter(lambda child: child.name == name, self.current.children))[0]
def add_new_child(self, child):
self.current.add_child(child)
Parsing the tree
There are a few different cases of lines:
- '$ cd /' which returns us to the root directory
- '$ ls' which starts the sequence of listing files and directories within the current ones
- here I use a while loop to catch all of those and add them as children
- '$ cd ..' simply goes up to the parent directory/node of the current one
- '$ cd <name>' which enters the given sub-directory
- side note: thankfully the (or my) input never tries to enter directories that you didn't previously see
with open('example.txt', 'r') as f:
lines = f.readlines()
lines = [entry.strip() for entry in lines]
tree = Tree()
while len(lines) > 0:
line = lines.pop(0)
if line == '$ cd /':
tree.reset_to_root()
elif line == '$ ls':
while len(lines)>0 and '$' not in lines[0]:
line = lines.pop(0)
size, name = line.split(' ')
if size.isdigit():
new_node = TreeNode(is_dir=False, name=name, size=int(size))
else:
new_node = TreeNode(is_dir=True, name=name)
tree.add_new_child(new_node)
elif line == '$ cd ..':
tree.go_up_one_level()
elif '$ cd' in line:
_, _, name = line.split(' ')
tree.go_to_child(name)
Putting it together
This is the method (within the TreeNode class), that solves Part 1:
def find_subdirectories_part1(self):
dir_sizes = 0
if self.is_dir:
for child in self.children:
if child.is_dir and child.get_size() <= 100000:
dir_sizes += child.get_size() + child.find_subdirectories_part1()
else:
dir_sizes += child.find_subdirectories_part1()
return dir_sizes
Here we recursively go through the tree building up a sum of file sizes as we go. When iterating through a nodes children, we only add the current directory size to our sum, if it's below the 100,000 threshold. Otherwise we just look deeper in the tree.
# call on root node to solve
tree.root.find_subdirectories_part1()
Part 2
After building up a whole tree in Part 1, Part 2 is fairly quick after reading the assignment. Basically, we are looking for a directory that is at least as big as 30000000 - (70000000 - tree.root.get_size()).
To do that I created a similar method as in Part 1. This time, I'm not building a sum, but instead collecting all the possible directory sizes in an array.
# method in TreeNode class
def find_subdirectories_part2(self, min_size):
dir_sizes = []
if self.is_dir:
for child in self.children:
if child.is_dir and child.get_size() >= min_size:
dir_sizes += [child.get_size()] + child.find_subdirectories_part2(min_size)
else:
dir_sizes += child.find_subdirectories_part2(min_size)
return dir_sizes
The instruction was the smallest sufficiently large directory, so finally I select the minimum of all directories found above.
total_space = 70000000
space_needed = 30000000
current_empty_space = 70000000 - tree.root.get_size()
possible_dirs = tree.root.find_subdirectories_part2(space_needed - current_empty_space)
min(possible_dirs)
Conclusion
I'm tired. I did this way too late.