Some CTF problems don’t really feel like they belong in a CTF challenge—this is one of those problems. For this one we’re given the following prompt: \[\textit{I made a program to solve a problem, but it seems too slow :(}\] This code, a file called \(\textit{encoded.txt}\) with two columns of \(1,769,610\) values that looks like this: \[\begin{bmatrix} 0 & 66 \\ 1 & 77611334 \\ 2 & 7296865972 \\ 3 & 6005985327 \\ \vdots & \vdots \\ \end{bmatrix}\] And the following hint: \[ax^2+bx+c\] The problem essentially presents a set of points \((x, y)\) where \(x\) values range from 0 to 1769610, and \(y\) values appear random. On first glance it appears that we need to solve a system of linear equations represented by a Vandermonde matrix.


Our system of linear equations looks like this:

\[\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} \begin{bmatrix} x_0^0 & x_0^1 & \cdots & x_0^n \\ x_1^0 & x_1^1 & \cdots & x_1^n \\ x_2^0 & x_2^1 & \cdots & x_2^n \\ \vdots & \vdots & \ddots & \vdots \\ x_n^0 & x_n^1 & \cdots & x_n^n \end{bmatrix} = \begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}\]

which is a definitional Vandermonde matrix. One can easily see how the original code will not work, as you’re attempting to solve a 1769610 by 1769610 matrix using np.linalg.solve, which relies on LU decomposition and is thus completely unfeasible. Following the principles of Reed–Solomon error correction and taking the hint into account, we can use Lagrange interpolation with degree two. The Lagrange interpolation formula is given by:

\[f(x) = \sum_{i=1}^{n+1} f(x_i) \prod_{\substack{j \neq i \\ j = 1}}^{n+1} \frac{x - x_j}{x_i - x_j}\]

To reduce the complexity of calculating the Lagrange basis polynomials, we use the derivative of the product function \(P(x) = \prod_{i=1}^{n+1} (x - x_i)\):

\[\prod_{\substack{j \neq i}} (x_i - x_j) = \lim_{x \to x_i} \frac{P(x)}{x - x_i}\] We can simplify this into: \[\lim_{x \to x_i} \frac{d}{dx} \left( \prod_{j=1}^{n+1} (x - x_j) \right)\] This multiplication of multiple polynomials can be optimized using a sort of divide-and-conquer, which reduces the complexity to \(O(n \log(n) \log(m))\), by using recursion and splitting the list of polynomials into halves/recursively multiplying them. Set \(v_i = \frac{y_i}{P'(x_i)}\), where \(P'(x_i)\) is the derivative of \(P(x)\) evaluated at \(x_i\). The final interpolation can then be written as: \[f(x) = \sum_{i=1}^{n+1} v_i \prod_{\substack{j \neq i}} (x - x_j)\] To get started:

  1. Parse the input points and construct the Vandermonde matrix and solution vector.

  2. Implement a function for polynomial multiplication using divide and conquer.

  3. Compute the derivative \(P'(x)\) at each \(x_i\).

  4. Construct the final interpolation function \(f(x)\) using \(v_i = \frac{y_i}{P'(x_i)}\).

In my opinion, this problem is not great for capture the flag. It still takes several hours to run the code (assume you don’t further optimize) and is overall a bit obtuse. But with this all in mind, Good luck!! Update: I highly reccomend checking out this solution, as it incorporates several different aspects to greatly speed up computation:


Problem Code

import galois
    import numpy as np
    MOD = 7514777789
    points = []
    for line in open('encoded.txt', 'r').read().strip().split('\n'):
        x, y = line.split(' ')
        points.append((int(x), int(y)))
    GF = galois.GF(MOD)
    matrix = []
    solution = []
    for point in points:
        x, y = point
        solution.append(GF(y % MOD))
        row = []
        for i in range(len(points)):
            row.append(GF((x ** i) % MOD))
    open('output.bmp', 'wb').write(bytearray(np.linalg.solve(GF(matrix), GF(solution)).tolist()[:-1]))

Buy Me a Git Push 🖥️