# DSA4212 Assignments

### Traveling Salesman Problem [80%]

**Goal:** The Traveling Salesman Problem (TSP) is a classic problem in combinatorial optimization. Given a list of cities and the distances between each pair of cities, the problem is to find the shortest possible route that visits each city exactly once and returns to the origin city. You are tasked with implementing a few methods to solve the TSP and compare their performance. Your report should not include any actual code. Instead, it should describe the problem and provide a discussion of the results. You are encouraged to compare your results with the best known solutions.

**Possible References:**

### Thompson problem [80%]

**Goal:** The Thompson problem is a classic problem in computational geometry. Consider `N = 300`

points \(P_1, P_2, \ldots, P_N \in \mathbb{R}^3\) on the unit sphere: for \(1 \leq i \leq N\), we have \(\|P_i\| = 1\). How should these points be placed so that the quantity

\[ E = \sum_{i=1}^{N} \sum_{j=1}^{N} \frac{1}{\|P_i - P_j\|} \]

is minimized. Note that it is the norm, and not the norm squared, in the denominator. Your report should not include any actual code. Instead, it should describe the problem and provide a discussion of the results. You are encouraged to compare your results with the best known solutions.

**Possible References:**

### Image classification with convolutional neural networks [80%]

**Goal:** Implement a simple convolutional neural network in Python and use it to classify images from the CIFAR-10 dataset. For this purpose, you will implement a small CNN on a GPU (eg. Google Colab) – the actual details of the CNN is not the emphasis of this assignment and you should not spend too much time optimizing the architecture. Instead, you will study:

- the influence of the choice of optimizer

- the effect of the learning rate on the speed of convergence and the final test accuracy

- whether using a learning rate schedule improves performance

- the effect of the batch size on the speed of convergence and the final test accuracy

Your report should not include any actual code. Instead, it should describe your experiments and their results, and provide a discussion of the findings.

**Possible References:**

- Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates

- How to train your ResNet
- Deep Learning Tuning Playbook

### Automatic differentiation [90%]

**Goal:** Implement a simple version of automatic differentiation in Python. Your implementation should be able to easily compute the gradient of the composition of any number of (simple) multivariate functions. While your report should not include any actual code, it should describe the mechanics of your implementation and provide examples of its use.

**Possible References:**

### Collaborative Filtering [90%]

**Goal:** Implement a collaborative filtering algorithm in Python from scratch – you should not use any high-level library such as `surprise`

since this would defeat the purpose of the assignment. For testing your implementation, you will use the song dataset available here. You are tasked to implement variations of the matrix factorization algorithms discussed in class and compare their performance with other approaches of your choice. Your report should not include any actual code. Instead, it should describe the approach you used, the results, and provide a discussion of the findings.

### Natural Evolution Strategies [100%]

The class of algorithms known as NaturalEvolutionStrategies (NES) has demonstrated significant potential in various fields of applied mathematics and machine learning. In this project, you will provide an overview of NES and apply these approaches to three optimization problems of your choice. Your report should not include any actual code. Instead, it should describe what NES are, how they work, and provide a discussion of the results.

**Possible References:**

- Information-geometric optimization algorithms: A unifying pic- ture via invariance principles

- Evolution Strategies as a Scalable Alternative to Reinforcement Learning

- Evolution Strategies
- Natural Evolution Strategies

### Word Embeddings [100%]

**Goal:** Implement a simple word embedding in Python (from scratch) and use it to find the most similar words to a given word. Your report should not include any actual code. Instead, it should describe how word embeddings work, how you implemented it, and provide a discussion of the results.

**Possible References:**

### LSTM for time series prediction [100%]

**Goal:** Implement a simple LSTM in Python and from scratch, ie. not use the `LSTM(...)`

function of a high-level library. You will then use your implementation to predict the next value in a time series dataset of your choice. Your report should not include any actual code. Instead, it should describe how a LSTM works, how you implemented it, and provide a discussion of the results.

**Possible References:**

- Understanding LSTM Networks
- Zico Kolter

- The Unreasonable Effectiveness of Recurrent Neural Networks
- Can recurrent neural networks warp time?

### Transformer architecture [100%]

**Goal:** Implement a simple transformer in Python and from scratch, ie. not use the `Transformer(...)`

function of a high-level library.

Some examples of simple task that can be used to test the transformer are:

- Sequence Reversal: reversing a sequences of numbers,
`[1,3,2,4,5,3] -> [3,5,4,2,3,1]`

.

- Basic Arithmetic: learning simple arithmetic operation is a surprisingly challenging task for a transformer. For example, given a sequence of numbers and arithmetic operations, the model should output the result of the operations.
`[1,3,2,"+",9,4,2]->[1,0,7,4]`

.`[1,3,2,"+",9,4,2]->[1,0,7,4]`

- Copying Task: A simple task where the model’s objective is to output the same sequence it receives as input. This might seem trivial but is a good starting point:
`[1,3,2,4,5,3]->[1,3,2,4,5,3]`

- Sorting Numbers: This can be very challenging for a transformer, especially if the sequence is long.
`[1,3,2,4,5,3]->[1,2,3,3,4,5]`

- Character-Level Text Generation: Generating text one character at a time, based on a given prompt or starting character.

Your report should not include any actual code. Instead, it should describe how a transformer works, how you implemented it, and provide a discussion of the results.

**Possible References:**

- The Annotated Transformer

- JAX: Transformers and Multi-Head Attention
- GPT in 60 lines of code
- The Transformer Family
- The Illustrated Transformer
- pytorch transformer from scratch

- Transformers: Zero to Hero

### Topic of your choice

Propose a topic and contact me to discuss it and for approval.