ST3247 Assignments

DSA4212

Introduction

A 2D self-avoiding random walk (SAW) is a path on the square lattice \(\mathbb{Z}^2\) that does not visit the same site more than once. Unlike standard random walks, which allow revisits to previously occupied sites, self-avoiding walks impose a strict non-recrossing condition, making them a fundamental model in statistical mechanics, polymer physics, combinatorics and probability theory. Self-avoiding walks are particularly interesting because they provide insights into the behavior of polymers, have connections to critical phenomena in statistical physics, have led to new results in combinatorial mathematics and probability theory. From a mathematical perspective, SAWs are notoriously difficult to analyze due to the combinatorial explosion of possible paths. Even determining the number of possible SAWs of a given length on a lattice remains a challenging combinatorial problem. The difficulty arises from the correlations introduced by the self-avoidance constraint, making standard probabilistic techniques inapplicable.

In this assignment, you will focus on estimating the number \(c_L\) of self-avoiding walks of a given length \(L\) on the 2D square lattice, as illustrated in the figure above. This remains an open problem with no exact formula known up to this date. As this problem is important to many areas of mathematics, a number of advanced computational techniques have been developed to study them. The purpose of this project is for you to explore some of these techniques and deepen your understanding of Monte Carlo methods.

Problem Statement

Consider the two-dimensional square lattice, denoted as \(\mathbb{Z}^2 = \{(x, y) : x, y \in \mathbb{Z}\}\). A self-avoiding walk (SAW) of length \(L\) is a sequence of \(L+1\) distinct vertices \((x_0, y_0), (x_1, y_1), \dots, (x_L, y_L)\) in \(\mathbb{Z}^2\) such that each successive vertex \(z_{i+1} = (x_{i+1}, y_{i+1})\) is a nearest neighbor of \(z_i = (x_i, y_i)\) for all \(i = 0, 1, \dots, L-1\). We denote such a walk as \(z_{0:L} = (z_0, z_1, \dots, z_L)\) and assume that it begins at the origin, i.e., \(z_0 = (0, 0)\). The number of self-avoiding walks of length \(L\) is denoted by \(c_L\), with known values \(c_1 = 4\) and \(c_2 = 12\). This quantity serves as the normalization constant for the uniform distribution \(\mathbb{P}_L\) over all self-avoiding walks of length \(L\) since we have

\[ \mathbb{P}_L(z_{0:L}) = \frac{1}{c_L} \mathbf{1}\{z_{0:L} \text{ is a SAW of length } L\}. \]

This means that the Monte-Carlo techniques for evaluating normalization constants can be applied to estimate \(c_L\). The number of self-avoiding walks grows exponentially with \(L\), and it is known that \(c_L \approx \mu^L\), where \(\mu \approx 2.6\) is called the connective constant. However, the exact value of \(\mu\) remains unknown. The main objective of this assignment is to estimate \(\mu\), or equivalently, the quantity \(\lambda = \log \mu\):

\[ \lambda = \lim_{N \to \infty} \; \frac{1}{N} \log c_N. \]

Computing \(c_N\) directly is infeasible for large \(N\) due to its exponential growth. Moreover, Monte Carlo methods are challenging to apply since the self-avoiding constraint makes uniform sampling difficult. For instance, in a standard random walk of length \(L\), there are \(4^L\) possible paths, but only a small fraction are self-avoiding. Given that \(c_L \approx \mu^L\) with \(\mu \approx 2.6\), the probability that a random walk is self-avoiding is approximately \((\mu/4)^L\). For \(L = 20\), this probability is roughly \(0.02\%\).

Organization of the report

The first few sections of your report should be as follows:

Introduction: Briefly introduce the problem and describe clearly what the main goal of the assignment is. This section is to make sure that you have understood the problem statement correctly. This section can be brief.
Basic deterministic methods: Implement a method that can compute \(c_N\) for small values of \(L\), at least up to \(L=10\). This will be useful to validate your Monte Carlo methods.
Basic Monte Carlo I: Implement a simple Monte-Carlo method that generates standard random walks of length \(L\) and estimates the fraction of self-avoiding walks.
Basic Monte Carlo II: Implement a more sophisticated Monte-Carlo method that generates a walk of length \(L\) by sampling the next step \(z_{i+1}\) uniformly from the set of possible neighbors of \(z_i\) that do not lead to a self-intersection. If there is no such neighbor, which can happen, the walk remains still at \(z_i\) until the end of the walk, i.e. \(z_{i+1} = z_i\) until \(i = L-1\). Discuss how can this procedure be used to estimate \(c_N\).

After the above \(4\) sections, you are then tasked with exploring more advanced Monte Carlo methods to estimate \(\lambda\) and \((\log c_N)/N\) for large values of \(N\).

Practical Details

Submission: A single submission is required for each group. The submission is on canvas. The submission should include a report and the code. The code can be submitted as a zip file (not exceeding 10 MB) or as a link to a public repository (e.g., github). The report should be in PDF format and should not include the code. The report should be at most 10 pages long, including figures and references. It is entirely possible to obtain a full grade with a report that is significantly shorter than 10 pages: do not write a longer report just to meet the page limit. Although not compulsory, you are encouraged to use Latex to write the report – equations are straightforward to write and this produce neat reports. A template is available here and with your NUS email, you can get a free account on Overleaf for collaborative editing.

Criteria: In this project, you will be evaluated on the following criteria:

Your exploration and understanding of various Monte Carlo methods for estimating \(\lambda\). You will be evaluated on the quality and breadth of your numerical experiments, your critical analysis of the results, as well as the extend to which you have attempted to explore the problem and the literature.
50% of the grade is based on the content of the report. The clarity and organization of your report are important.
50% of the grade is based on the quality of the code. The code should be well-documented and easy to read.
It is entirely fine to use a LLM-assistant to improve the quality of your writing and code, as well as for brainstorming ideas.

All sources (github repositories, papers, blog posts, etc.) should be properly cited. Failure to do so will be considered plagiarism. Although it is fine (and even encouraged) to discuss the assignments with other students/groups, the report and the code should be entirely your own work. You are encouraged to research extensively, experiment with different approaches, and explore creative ideas. A deep understanding of the methods used and the ability to explain them clearly is crucial. Do not just copy code from the internet without understanding it. This project is open-ended, and you are expected to explore the problem in depth.

As a last remark, the assignment is designed to be challenging. Do not be discouraged if you find it difficult. The goal is to learn and to improve your skills. If you are stuck, do not hesitate to ask for help and reach out to me. Furthermore, it may be a good idea to build a proper git repository to keep track of your progress and to showcase your work when applying for internships or jobs. There is a lot of room for creativity in this project and exploring new ideas is encouraged.

Deadlines:

Friday 28 March 2025: I am giving you the opportunity to submit a draft of your report; you can add a section with the title “Questions” where you can ask me specific questions about the project. I will provide feedback on the draft, and you can use this feedback to improve your final submission. The draft should include the first \(4\) sections of the report highlighted above. The draft should be submitted on canvas. Note that this draft is optional, but I highly recommend that you submit it. This draft will not be graded: its only purpose is to provide you with feedback and improve your final submission.
Friday 25 April 2025: Final submission of the report and the code. The report should be submitted on canvas. The code can be submitted as a zip file (not exceeding 10 MB) or as a link to a public repository (e.g., github). The report should be in PDF format and should not include the code. The report should be at most 10 pages long, including figures and references. It is entirely possible to obtain a full grade with a report that is significantly shorter than 10 pages: do not write a longer report just to meet the page limit. Do include the name and student number of all group members in the report.