Weak Vertices
The Problem
In graph theory, the structural integrity of a network can often be analyzed by identifying fundamental shapes within it, such as triangles. A triangle provides rigidity and is a common motif in many applications. This problem asks us to identify vertices that are not part of any triangle. A vertex i is defined as being part of a triangle if it has two distinct neighbors, j and k, which are also neighbors of each other. Our task is to find all vertices that do not satisfy this condition, which the problem statement refers to as “weak vertices.” The graph is given to us in the form of an adjacency matrix.
My Code
My approach is a direct simulation of the definition. For each vertex, I iterate through all pairs of its neighbors and check if they are connected.
// Iterate through each vertex `i` from 0 to n-1 to check if it's weak.
for(int i=0; i<n; i++) {
// A flag tracking whether vertex `i` is weak, i.e., not yet found in any triangle.
// We initialize it to true, assuming `i` is weak until proven otherwise.
bool flag = true;
// Iterate through all neighbors of `i`.
// The outer loop picks the first neighbor, `j`.
for(int j=0; j<siblings[i].size(); j++) {
// The inner loop picks the second neighbor, `k`.
// We start from j+1 to ensure that we consider each pair of neighbors only once.
for(int k=j+1; k<siblings[i].size(); k++) {
// Check if the two neighbors of `i`, namely `siblings[i][j]` and `siblings[i][k]`,
// are connected to each other.
if(graph[siblings[i][j]][siblings[i][k]]) {
// If they are connected, then vertex `i` and its two neighbors `siblings[i][j]`
// and `siblings[i][k]` form a triangle.
// Thus, vertex `i` is not weak. We set the flag to false.
flag = false;
// Since we have found a triangle involving `i`, we can stop checking for this vertex.
break;
}
}
// If the flag has been set to false, we can break out of the outer neighbor loop as well.
if(!flag) {
break;
}
}
// If, after checking all pairs of neighbors, the flag is still true,
// it means no triangle was found involving `i`. Thus, `i` is a weak vertex.
if(flag) {
cout<<i<<' ';
}
}
My Solution
The problem requires us to identify all vertices in a given undirected graph that are not part of any triangle. A vertex is considered “weak” if this condition holds. The graph’s structure is provided via an $n \times n$ adjacency matrix, where $n$ is the number of vertices.
My solution directly implements the definition provided. The core of the logic revolves around exhaustively checking, for each vertex, whether the condition for being part of a triangle is met. The algorithm proceeds vertex by vertex, from $0$ to $n-1$. For each vertex, which we can call $i$, we adopt the hypothesis that it is a weak vertex. We then try to disprove this hypothesis by searching for evidence of a triangle involving $i$. A triangle involving vertex $i$ would consist of $i$ and two of its neighbors, say $j$ and $k$, where $j$ and $k$ are also connected by an edge.
To perform this check, my algorithm first identifies all neighbors of the vertex $i$. This is done by reading the adjacency matrix during the input phase and storing the neighbors of each vertex in an adjacency list representation, which I’ve called siblings. This pre-processing step simplifies the neighbor lookup process. For a given vertex $i$, siblings[i] contains a list of all vertices $j$ such that there is an edge between $i$ and $j$.
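For concreteness, the input phase described above might look like the following sketch. The names graph and siblings match my code, but the exact input format ($n$ followed by the rows of the 0/1 matrix) is an assumption:

// Sketch of the pre-processing step (assumed input: n, then an n x n 0/1 matrix).
int n;
cin >> n;
vector<vector<bool>> graph(n, vector<bool>(n, false));
vector<vector<int>> siblings(n);
for(int i = 0; i < n; i++) {
for(int j = 0; j < n; j++) {
int edge;
cin >> edge;
graph[i][j] = (edge == 1);
// Record j as a neighbor ("sibling") of i whenever an edge exists.
if(edge == 1) {
siblings[i].push_back(j);
}
}
}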
With the list of neighbors for vertex $i$ at hand, the next step is to examine all possible pairs of these neighbors. If vertex $i$ has $d_i$ neighbors, there are $\binom{d_i}{2}$ such pairs. I use a pair of nested loops to iterate through every unique pair of neighbors. Let the neighbors be indexed from $0$ to $d_i-1$ in the siblings[i] list. The outer loop selects a neighbor at index $j$, and the inner loop selects another neighbor at index $k$, where $k > j$. This ensures that each pair is considered exactly once and avoids redundant checks (e.g., checking both $(j, k)$ and $(k, j)$) and self-comparisons (where $j=k$).
For each pair of neighbors, say vertex $u$ (from siblings[i][j]) and vertex $v$ (from siblings[i][k]), the algorithm checks if there is an edge connecting $u$ and $v$. This check is efficiently performed by looking up the entry in the adjacency matrix, graph[u][v]. If this entry is true (or 1), it signifies that the edge $(u, v)$ exists. The existence of edges $(i, u)$, $(i, v)$, and $(u, v)$ confirms the presence of a triangle $\{i, u, v\}$.
A boolean variable, flag, is used to maintain the status of the current vertex $i$. It is initialized to true, representing the initial assumption that $i$ is weak. As soon as the algorithm finds a single pair of neighbors of $i$ that are connected to each other, it has found a triangle involving $i$. At this point, the hypothesis that $i$ is weak is disproven. The flag is set to false, and since we only need to find one triangle to disqualify a vertex from being weak, the search for this vertex $i$ can be terminated immediately. The break statements are used to exit the inner and outer loops prematurely to improve efficiency.
If the nested loops complete their execution without the flag ever being set to false, it means that for every pair of neighbors of $i$, no direct edge exists between them. This exhaustively proves that there are no triangles involving vertex $i$. Therefore, the initial hypothesis holds, and vertex $i$ is indeed a weak vertex. In this case, the algorithm prints the index $i$. This entire process is repeated for all vertices in the graph, ensuring that every vertex is correctly classified.
To formally prove the correctness of this algorithm, let us use the language of predicate logic. Let $G=(V, E)$ be the graph, where $V=\{0, 1, \dots, n-1\}$ is the set of vertices and $E$ is the set of edges. Let $A$ be the adjacency matrix representation of $G$, such that $A_{uv} = 1$ if $(u,v) \in E$, and $A_{uv} = 0$ otherwise. A vertex $i \in V$ is defined as being part of a triangle, let’s denote this property as $T(i)$, if and only if:
$T(i) \iff \exists j, k \in V \text{ such that } (j \neq i) \land (k \neq i) \land (j \neq k) \land (A_{ij}=1) \land (A_{ik}=1) \land (A_{jk}=1)$.
A vertex $i$ is weak, let’s denote this as $W(i)$, if it is not part of any triangle:
$W(i) \iff \neg T(i) \iff \neg (\exists j, k \in V : (j \neq i) \land (k \neq i) \land (j \neq k) \land A_{ij}=1 \land A_{ik}=1 \land A_{jk}=1)$.
By De Morgan’s laws, this is equivalent to:
$W(i) \iff \forall j, k \in V, \neg ((j \neq i) \land (k \neq i) \land (j \neq k) \land A_{ij}=1 \land A_{ik}=1 \land A_{jk}=1)$.
This simplifies to:
$W(i) \iff \forall j, k \in V, ((A_{ij}=1) \land (A_{ik}=1)) \implies (A_{jk}=0 \lor i=j \lor i=k \lor j=k)$.
Let $N(i)$ be the set of neighbors of $i$, i.e., $N(i) = \{j \in V \mid A_{ij}=1\}$. The condition for being in a triangle can be restated as:
$T(i) \iff \exists j, k \in N(i) \text{ such that } (j \neq k) \land (A_{jk}=1)$.
Consequently, the condition for being a weak vertex is:
$W(i) \iff \forall j, k \in N(i), (j=k) \lor (A_{jk}=0)$. Since we are only interested in pairs of distinct neighbors, this is equivalent to:
$W(i) \iff \forall j, k \in N(i) \text{ with } j \neq k, \text{ it holds that } A_{jk}=0$.
My algorithm’s logic is a direct implementation of this final statement. The outer for loop ensures that every vertex $i \in V$ is considered. For each $i$, the nested loops iterate through all unique pairs of distinct neighbors $\{j, k\} \subseteq N(i)$. The condition if(graph[siblings[i][j]][siblings[i][k]]) is a direct check for whether $A_{jk}=1$. If this condition is ever true, the flag is set to false, correctly concluding that $\neg W(i)$ is true. If the loops complete without this condition ever being met, it means that for all pairs of distinct neighbors $\{j, k\}$, $A_{jk}=0$, which is the definition of $W(i)$. Thus, the algorithm correctly identifies and prints precisely the set of all weak vertices. The correctness is therefore established.
Time and Space Complexity
The analysis of the algorithm’s complexity is crucial for understanding its performance, especially as the size of the graph grows.
Time Complexity:
The dominant part of the algorithm is the set of nested loops that execute for each vertex. Let’s analyze the work done. The main process is enclosed in a for loop that iterates from $i=0$ to $n-1$, where $n$ is the number of vertices. This loop runs $n$ times. Inside this loop, for each vertex $i$, we perform a check for triangle formation. This check involves iterating through pairs of neighbors of $i$. Let $d_i$ denote the degree of vertex $i$, which is the number of neighbors it has (siblings[i].size()). The two nested loops are designed to select every unique pair of these $d_i$ neighbors. The number of such pairs is given by the binomial coefficient $\binom{d_i}{2} = \frac{d_i(d_i-1)}{2}$. For each pair, a constant time lookup is performed in the adjacency matrix. Therefore, the total number of operations for a single vertex $i$ is proportional to $d_i^2$. Summing this over all vertices, the total time complexity $T(n)$ is given by:
$T(n) = \sum_{i=0}^{n-1} O(d_i^2) = O(\sum_{i=0}^{n-1} d_i^2)$.
In the worst-case scenario, the graph is a complete graph ($K_n$), where every vertex is connected to every other vertex. In this case, the degree of every vertex is $d_i = n-1$. The time complexity becomes:
$T(n) = O(\sum_{i=0}^{n-1} (n-1)^2) = O(n \cdot (n-1)^2) = O(n^3)$.
This cubic complexity indicates that the algorithm is well-suited for graphs with a small number of vertices, as specified by the problem constraints ($n \le 20$), where $20^3 = 8000$, which is computationally very feasible.
To prove this formally, let $C$ be the maximum cost of a single check inside the innermost loop (a constant). The total number of operations, $T_{ops}(n)$, can be bounded by: $T_{ops}(n) \le \sum_{i=0}^{n-1} C \cdot \frac{d_i(d_i-1)}{2}$. Since $d_i \le n-1$ for any vertex $i$, we can establish an upper bound: $T_{ops}(n) \le \sum_{i=0}^{n-1} C \cdot \frac{(n-1)(n-2)}{2} = n \cdot C \cdot \frac{(n-1)(n-2)}{2}$. This expression is a polynomial of degree 3 in $n$. According to the definition of Big-O notation, there must exist constants $c > 0$ and $n_0$ such that for all $n \ge n_0$, $T_{ops}(n) \le c \cdot n^3$. Our derived upper bound $n \cdot C \cdot \frac{(n-1)(n-2)}{2}$ clearly satisfies this, so the time complexity is formally $O(n^3)$.
Space Complexity: The space complexity is determined by the amount of memory required to store the graph structure. My solution uses two primary data structures for this purpose:
- An adjacency matrix graph, which is an $n \times n$ matrix of booleans. This requires $O(n^2)$ space.
- An adjacency list siblings, which is a vector of vectors. In the worst case of a dense graph, the total number of entries across all lists is $2|E|$, where $|E|$ is the number of edges. For a complete graph, $|E| = \binom{n}{2} = O(n^2)$. Thus, the adjacency list also requires $O(n^2)$ space in the worst case.

Other variables, such as loop counters and the boolean flag, use only a constant amount of additional space, $O(1)$. Therefore, the total space complexity is dominated by the graph representations, making it $O(n^2)$.
Formally, let $S(n)$ be the total space required. $S(n) = \text{space}(\text{graph}) + \text{space}(\text{siblings}) + \text{space}(\text{other})$. $\text{space}(\text{graph}) = n \times n \times \text{sizeof(bool)} = \Theta(n^2)$. $\text{space}(\text{siblings}) = \sum_{i=0}^{n-1} d_i \times \text{sizeof(int)} = 2|E| \times \text{sizeof(int)} = O(|E|)$. Since $|E| \le n^2$, we have $\text{space}(\text{siblings}) = O(n^2)$. $\text{space}(\text{other}) = O(1)$. So, $S(n) = \Theta(n^2) + O(n^2) + O(1) = O(n^2)$. This concludes the formal complexity analysis.
Where’s my Internet?
The Problem
This problem presents a scenario of houses in a new town that need to be connected to the internet. We are given $N$ houses, numbered 1 to $N$, and $M$ existing network cable connections between pairs of houses. House 1 is the source of the internet connection for the entire town. A house is considered connected to the internet if it is house 1, or if it is connected by a cable to another house that is already connected. The task is to identify and list all the houses that are not yet connected to the internet. If all houses are connected, we should report that.
My Code
This problem is about connectivity in a graph. I recognized it as a perfect use case for a Disjoint Set Union (DSU), also known as a Union-Find data structure. I model the houses as nodes in a graph and the cables as edges. All houses that can reach each other form a connected component, which the DSU can track efficiently.
/// \brief Disjoint Set Union (Union-Find) class
class UnionFind {
private:
vector<int> parent; // parent[i] stores the parent of element i
vector<int> rank; // Used for union by rank optimization
public:
// Constructor initializes n elements, each in its own set.
explicit UnionFind(int n) {
parent.resize(n);
rank.resize(n, 0);
// Initially, each element is its own parent.
for(int i = 0; i < n; i++) {
parent[i] = i;
}
}
// Finds the representative (root) of the set containing x, with path compression.
int find(int x) {
if(parent[x] != x) {
parent[x] = find(parent[x]); // Path compression
}
return parent[x];
}
// Unites the sets containing x and y, using union by rank.
void unite(int x, int y) {
int rootX = find(x);
int rootY = find(y);
if(rootX != rootY) { // Only unite if they are in different sets
if(rank[rootX] < rank[rootY]) {
parent[rootX] = rootY;
} else {
parent[rootY] = rootX;
if(rank[rootX] == rank[rootY]) {
rank[rootX]++;
}
}
}
}
// Checks if x and y are in the same set.
bool same(int x, int y) {
return find(x) == find(y);
}
};
// Main logic
int N,M;
cin >> N >> M;
// Create a DSU structure for N+1 houses (using 1-based indexing).
UnionFind uf = UnionFind(N+1);
while(M--) {
int a, b;
cin >> a >> b;
// Each cable connection means we unite the sets of the two houses.
uf.unite(a,b);
}
bool flag = false; // Flag to check if any house is unconnected.
// Iterate through all houses from 2 to N.
for(int i=2; i<=N; i++) {
// A house `i` is connected to the internet if it's in the same set as house 1.
if(!uf.same(i,1)) {
cout << i << endl; // If not, print its number.
flag = true;
}
}
// If the flag was never set, all houses were connected.
if(!flag) {
cout << "Connected";
}
My Solution
The problem asks us to determine which houses, in a network of $N$ houses and $M$ cable connections, are not connected to the internet. The internet originates from house 1. Connectivity is transitive: if house A is connected to house B, and B is connected to C, then A, B, and C are all mutually connected. A house has internet if it is part of the same connected block as house 1.
This problem can be modeled as finding the connected components of a graph. The houses are the vertices, and the cables are the edges. All houses in the same connected component as vertex 1 will have internet access. All houses in other components will not. My chosen data structure to solve this is the Disjoint Set Union (DSU) or Union-Find data structure. This structure is specifically designed to keep track of a partition of a set of elements into a number of disjoint, non-overlapping subsets. In our case, the set of elements is the set of houses, and each subset will represent a connected component.
Initially, I treat each of the $N$ houses as being in its own separate component. The DSU is initialized with $N+1$ elements (to allow for 1-based indexing from 1 to $N$), where each element is its own “parent” or “representative,” signifying $N$ disjoint sets.
Then, I process the $M$ cable connections one by one. Each connection between two houses, say house a and house b, signifies that these two houses are directly connected. In the context of our connected components, this means that the component containing a and the component containing b should now be considered a single, larger component. The unite(a, b) operation in my DSU class accomplishes this. It finds the representatives of the sets containing a and b. If they are not already the same, it merges the two sets. To keep the structure efficient, I employ two standard optimizations: union by rank and path compression.
- Union by Rank: When merging two sets, the representative of the set with a smaller “rank” (a rough measure of the tree’s depth) is attached to the representative of the set with a larger rank. This helps to keep the trees representing the sets from becoming too tall and unbalanced, which would slow down the find operation.
- Path Compression: During a find(x) operation, which traverses the path from element x to its root representative, we make every node on that path point directly to the root. This dramatically flattens the tree structure over time, making subsequent find operations on any of those elements (or their descendants) nearly constant time. (An iterative variant of find is sketched just after this list.)
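As a side note, the find in my class is recursive. With union by rank the trees stay only $O(\log N)$ deep, so recursion depth is not a concern here, but the same compression can be done iteratively without any call stack. A minimal sketch (my own variant, not part of the submitted class; it assumes the same parent vector):

// Iterative find with two-pass path compression; same effect as the recursive version.
int find_iterative(int x) {
int root = x;
// First pass: walk up the parent chain to locate the root.
while(parent[root] != root) {
root = parent[root];
}
// Second pass: repoint every node on the path directly at the root.
while(parent[x] != root) {
int next = parent[x];
parent[x] = root;
x = next;
}
return root;
}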
After processing all $M$ cable connections, the DSU structure accurately reflects the connected components of the housing network. Each set in the DSU corresponds to one connected component. The final step is to determine which houses are connected to the internet. A house i has internet if and only if it is in the same connected component as house 1. The DSU provides a highly efficient way to check this: the same(i, 1) function. This function returns true if i and 1 belong to the same set (i.e., they have the same root representative) and false otherwise.
My code iterates through all houses from 2 to $N$. For each house i, it calls uf.same(i, 1). If this returns false, it means house i is not in the same component as house 1 and therefore lacks an internet connection. The number i is then printed. If the loop completes and no such houses are found, it means every house from 2 to $N$ is in the same component as house 1. In this case, the program prints “Connected”.
The formal proof of correctness for this approach rests on the DSU’s ability to correctly model equivalence relations. The “is connected to” relation in a graph is an equivalence relation:
- Reflexive: Any vertex v is connected to itself (by a path of length 0).
- Symmetric: If u is connected to v, then v is connected to u (since edges are undirected).
- Transitive: If u is connected to v and v is connected to w, then u is connected to w.

An equivalence relation partitions a set into disjoint equivalence classes. In graph theory, these equivalence classes are precisely the connected components. The DSU data structure is designed to maintain such a partition. Let’s prove that after processing all edges, two vertices $u, v$ are in the same set in the DSU if and only if there is a path between them in the graph.
Proof by induction on the number of unite operations (edges processed):
- Base Case: Before any edges are processed ($m=0$), each vertex is in its own set. This is correct, as there are no paths between distinct vertices.
- Inductive Hypothesis: Assume that after processing $k$ edges, the DSU correctly represents the connected components of the graph formed by these $k$ edges.
- Inductive Step: Consider processing the $(k+1)$-th edge, $(u, v)$. The unite(u, v) operation merges the sets containing $u$ and $v$. In the graph, this new edge creates a path between any node in $u$’s old component and any node in $v$’s old component. Therefore, these two components should merge into one, which is exactly what the unite operation does. Any two nodes that were in the same component before remain so. Any two nodes that were in different components and are not in the newly merged component also remain in different components. Thus, the DSU partition remains correct after processing the $(k+1)$-th edge.

By induction, after all $M$ edges are processed, the DSU correctly partitions the vertices into the graph’s connected components. A house i has internet if and only if there is a path from i to 1. This is equivalent to i and 1 being in the same connected component. My algorithm checks this using uf.same(i, 1). Therefore, the logic is correct.
Time and Space Complexity
Time Complexity: The overall time complexity is determined by three parts: initialization of the DSU, processing the $M$ cable connections, and the final check for connectivity.
- Initialization: The UnionFind constructor initializes the parent and rank arrays for $N+1$ elements. This involves a single loop that runs $N+1$ times. The complexity is $O(N)$.
- Processing Connections: We loop $M$ times. In each iteration, we perform two find operations and one unite operation. With the crucial optimizations of path compression and union by rank (or size), the amortized time complexity of a single find or unite operation is nearly constant. It is more formally expressed using the inverse Ackermann function, $\alpha(N)$, which is a very slowly growing function. For all practical purposes in computing, $\alpha(N)$ is less than 5. Thus, the cost of one operation can be considered $O(\alpha(N))$. The total time for processing all $M$ connections is therefore $O(M \cdot \alpha(N))$.
- Final Check: We loop from house 2 to $N$, which is $N-1$ iterations. In each iteration, we perform a same(i, 1) call, which involves two find operations. The total time for this part is $O(N \cdot \alpha(N))$.
Combining these parts, the total time complexity is $O(N + M \cdot \alpha(N) + N \cdot \alpha(N))$, which simplifies to $O((N+M)\alpha(N))$. Given the constraints ($N, M \le 200,000$), this is highly efficient.
To formally state this, let’s denote the cost of a DSU operation as $T_{dsu}(N)$. With both path compression and union by rank/size, it is known that a sequence of $m$ operations on $n$ elements takes $O(m \cdot \alpha(n))$ time. In our case, the initialization takes $T_{init}(N) = c_1 N$ time. The main loop performs $M$ unite operations, each of which calls find twice. This is a sequence of $3M$ operations. The final loop performs $N-1$ same operations, each calling find twice, for a total of $2(N-1)$ operations. The total number of DSU operations is $m = 3M + 2(N-1)$. The total time is $T(N, M) = T_{init}(N) + T_{ops}(N, M) = c_1 N + O((3M + 2(N-1))\alpha(N)) = O((N+M)\alpha(N))$. This confirms the complexity.
Space Complexity: The memory usage is primarily determined by the storage required for the DSU data structure.
- The parent vector stores an integer for each of the $N+1$ houses. This requires $O(N)$ space.
- The rank vector also stores an integer for each of the $N+1$ houses, also requiring $O(N)$ space.

The other variables are of constant size. Therefore, the total space complexity of my solution is $O(N)$.
Formally, $S(N) = \text{space}(\text{parent}) + \text{space}(\text{rank}) = c_1 (N+1) + c_2 (N+1) = O(N)$.
Grid
The Problem
This problem challenges us to find the shortest path on a grid. We are given an $n \times m$ grid where each cell contains a single digit, say $k$. From a cell containing the digit $k$, we can make a “move” by jumping exactly $k$ squares in one of the four cardinal directions (up, down, left, or right). We are not allowed to jump off the grid. The goal is to find the minimum number of moves required to travel from the top-left corner (cell (0, 0)) to the bottom-right corner (cell (n-1, m-1)). If the destination is unreachable, we should report -1.
My Code
This problem is a shortest path problem on an unweighted graph. The grid cells can be thought of as nodes, and the possible jumps as edges. A Breadth-First Search (BFS) is the classic algorithm for finding the shortest path in an unweighted graph. My implementation uses a priority queue, which makes it resemble Dijkstra’s algorithm. However, since all “edge weights” are implicitly 1 (each move counts as one step), this approach functions identically to a standard BFS and correctly finds the shortest path.
// A struct to hold the state: coordinates (x, y) and the number of steps taken.
struct state {
public:
int x;
int y;
int step;
};
// Main logic
int n, m;
cin >> n >> m;
// Read the grid from input.
vector<vector<int>> grid(n, vector<int>(m));
for(int i = 0; i < n; i++) {
string line;
cin >> line;
for(int j = 0; j < m; j++) {
grid[i][j] = line[j] - '0';
}
}
// Hash and equality functors for pair<int, int>. The standard library does not
// provide these for pairs, so the unordered_set below needs definitions like:
struct pair_hash {
size_t operator()(const pair<int, int> &p) const {
// Combine the two coordinates into one value before hashing.
return hash<long long>()(((long long)p.first << 32) | (unsigned int)p.second);
}
};
struct pair_equal {
bool operator()(const pair<int, int> &a, const pair<int, int> &b) const {
return a.first == b.first && a.second == b.second;
}
};
// A set to keep track of visited grid cells to avoid cycles and redundant computations.
unordered_set<pair<int, int>, pair_hash, pair_equal> visited{};
// A priority queue to manage states to visit. While a simple queue is sufficient
// for a standard BFS, a priority queue also works correctly here because all
// edge weights are uniform (1). It will explore layer by layer.
// The comparator orders states by step count (a min-heap); the priority queue
// itself is not strictly necessary, but it was used here for experimentation.
auto comp = [](const state &a, const state &b) {
return a.step > b.step;
};
priority_queue<state, vector<state>, decltype(comp)> pq(comp); // Pass `comp`: lambdas are not default-constructible before C++20.
// Edge case: a 1x1 grid means the start already is the destination, so 0 moves.
if(n == 1 && m == 1) {
cout << 0;
return 0;
}
pq.push(state{0, 0, 0}); // Start at (0,0) with 0 steps.
while(!pq.empty()) {
auto top = pq.top();
pq.pop();
int x = top.x;
int y = top.y;
// If we have already processed this cell, skip it.
if(visited.count({x, y})) {
continue;
}
visited.insert(make_pair(x, y));
// Get the jump distance from the current cell.
int jump_dist = grid[x][y];
// Define the four possible moves.
pair<int, int> next_moves[4] = {
make_pair(x + jump_dist, y), // Down
make_pair(x - jump_dist, y), // Up
make_pair(x, y + jump_dist), // Right
make_pair(x, y - jump_dist) // Left
};
for(auto [next_x, next_y]: next_moves) {
// Check if the destination is reached.
if(next_x == n - 1 && next_y == m - 1) {
cout << top.step + 1; // Found the shortest path.
return 0;
}
// Check if the move is valid (within grid bounds) and not to a zero-jump cell.
if(next_x >= 0 && next_x < n && next_y >= 0 && next_y < m && grid[next_x][next_y]!=0) {
// Push the new state to the queue.
pq.push(state{next_x, next_y, top.step + 1});
}
}
}
// If the queue becomes empty and the destination was not reached, it's impossible.
cout << -1;
My Solution
This problem asks for the minimum number of moves to get from a starting cell to a target cell in a grid. This is a classic example of a shortest path problem on a graph. The graph in question can be constructed by considering each cell $(i, j)$ of the grid as a vertex. An edge exists from vertex $(i, j)$ to another vertex $(i', j')$ if we can legally jump from the first cell to the second in a single move. Since each move counts as 1, the graph is unweighted. The problem is then reduced to finding the shortest path from the vertex corresponding to cell $(0, 0)$ to the vertex for cell $(n-1, m-1)$.
The canonical algorithm for finding the shortest path in an unweighted graph is the Breadth-First Search (BFS). BFS systematically explores the graph layer by layer, starting from the source vertex. It discovers all vertices at distance 1, then all vertices at distance 2, and so on. This property guarantees that the first time BFS reaches the destination vertex, it will have done so via a shortest path.
My solution implements this BFS-like traversal. I use a data structure, in this case a priority queue, to store the “frontier” of the search—the set of cells that have been reached but whose neighbors have not yet been explored. A state struct is used to store the coordinates of a cell and the number of moves (step) taken to reach it. The search starts by placing the initial state, {0, 0, 0}, into the queue.
The main part of the algorithm is a loop that continues as long as the queue is not empty. In each iteration, it extracts a state from the queue. Let’s say we extract the state for cell $(x, y)$, reached in top.step moves. To prevent infinite loops in case of cycles and to avoid redundant computations, I use a visited set. If the cell $(x, y)$ has already been visited, we simply discard the current state and proceed to the next one. Otherwise, we mark $(x, y)$ as visited.
Next, the algorithm determines the jump distance, $k = \text{grid}[x][y]$, from the current cell. It then calculates the coordinates of the four potential destination cells by adding or subtracting $k$ from the current coordinates. For each potential move to a cell $(next_x, next_y)$:
- It first checks if this new cell is the target destination, $(n-1, m-1)$. If it is, we have found a path. Since BFS explores paths in increasing order of length, this must be a shortest path. The length is the current path length (top.step) plus one more move. The program prints this number and terminates.
- If it’s not the destination, the algorithm verifies that the move is valid: the new coordinates must be within the grid’s boundaries ($0 \le next_x < n$ and $0 \le next_y < m$). I also added a check to ensure the jump value at the destination cell is not zero, which would create a dead end.
- If the move is valid, a new state {next_x, next_y, top.step + 1} is created and added to the queue for future exploration.
If the main loop finishes (i.e., the queue becomes empty) and the destination has not been reached, it means that the destination cell is in a part of the graph that is not reachable from the starting cell. In this case, the algorithm prints -1.
Now, let’s provide a formal proof of correctness. The algorithm is a variant of Dijkstra’s algorithm, which, on a graph with uniform positive edge weights (all equal to 1 in this case), behaves identically to BFS. Let’s prove that it finds the shortest path.
Let $d(v)$ be the length of the shortest path from the source vertex $s$ (cell $(0,0)$) to any vertex $v$. We want to prove that when the algorithm terminates upon reaching the destination $t$ (cell $(n-1, m-1)$), the path length it found, step+1, is equal to $d(t)$.
Let’s prove by induction that when a vertex $v$ is extracted from the priority queue, the value v.step is equal to $d(v)$.
- Base Case: The first vertex extracted is the source $s$, with s.step = 0. The shortest path from a source to itself is indeed 0. So, the base case holds.
- Inductive Hypothesis: Assume for all vertices $u$ extracted from the queue before a vertex $v$, it holds that u.step $= d(u)$.
- Inductive Step: Let $v$ be the next vertex to be extracted from the queue. Let $u$ be the vertex that caused $v$ to be added to the queue. This means there is an edge $(u, v)$ and v.step = u.step + 1. By the inductive hypothesis, u.step $= d(u)$, so v.step $= d(u) + 1$, and there is a path to $v$ of length v.step. We now need to show this is a shortest path. Consider any path from $s$ to $v$, say $s=v_0, v_1, \dots, v_k=v$, so $d(v) \le k$. Let $v_i$ be the first vertex on this path that has not yet been extracted from the queue; all its predecessors $v_0, \dots, v_{i-1}$ have been extracted. When the immediate predecessor $v_{i-1}$ was processed, the state for $v_i$ was pushed to the queue with a step count of v_{i-1}.step + 1, and by the inductive hypothesis v_{i-1}.step $= d(v_{i-1}) = i-1$. So $v_i$ sits in the queue with a step value of $i$. Since my priority queue (and a standard BFS queue) processes nodes in non-decreasing order of their step value, $v$ could not have been extracted before $v_i$ if v.step > v_i.step. Any path to $v$ must pass through such a frontier node, and the algorithm always expands the frontier node with the smallest step count. Therefore, when it extracts $v$, it does so via a path of length $d(v)$, i.e., v.step $= d(v)$.

Since the algorithm terminates as soon as it reaches the destination $t$, the step count at that point will be $d(t)$, the shortest path length. If $t$ is never reached, the queue will become empty, correctly indicating no path exists.
Time and Space Complexity
Time Complexity:
The time complexity of this search algorithm depends on the number of vertices and edges in the graph. In our grid, the number of vertices $|V|$ is $n \times m$. Each vertex has at most 4 outgoing edges. So, the number of edges $|E|$ is at most $4 \times n \times m$, which is $O(nm)$.
The main loop runs as long as the priority queue is not empty. Each vertex (cell) is expanded at most once thanks to the visited set, although a cell may be pushed several times (once per incoming edge) before it is first popped; the later duplicates are popped and skipped. When we process a vertex, we do a constant number of operations (checking its 4 neighbors). The operations on the priority queue (push and pop) take logarithmic time in the size of the queue. The maximum size of the queue is $O(nm)$.
So, the complexity is dominated by the queue operations: there are $O(|E|)$ pushes and pops in total (one per edge in the worst case), each costing $O(\log |V|)$. The total time complexity is $O((|V| + |E|) \log |V|) = O(nm \log(nm))$.
Note: If a standard queue were used for a pure BFS, the enqueue and dequeue operations would be $O(1)$, leading to a time complexity of $O(|V|+|E|) = O(nm)$. Since my solution uses a priority queue, the logarithmic factor is included in the analysis.
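For illustration, the plain-queue variant mentioned in the note might look like the following sketch. It reuses the state struct and grid from my code; the dist matrix, which doubles as the visited marker, is my own addition here:

// Sketch: standard BFS with a FIFO queue. dist[x][y] is the number of moves
// needed to reach (x, y), or -1 if the cell has not been reached yet.
queue<state> q;
vector<vector<int>> dist(n, vector<int>(m, -1));
dist[0][0] = 0; // (The 1x1-grid edge case handled earlier still applies.)
q.push(state{0, 0, 0});
while(!q.empty()) {
state cur = q.front();
q.pop();
int k = grid[cur.x][cur.y];
pair<int, int> moves[4] = {{cur.x + k, cur.y}, {cur.x - k, cur.y},
{cur.x, cur.y + k}, {cur.x, cur.y - k}};
for(auto [nx, ny]: moves) {
// A cell is enqueued the first time it is seen, which in BFS is optimal.
if(nx >= 0 && nx < n && ny >= 0 && ny < m && dist[nx][ny] == -1) {
dist[nx][ny] = cur.step + 1;
if(nx == n - 1 && ny == m - 1) {
cout << dist[nx][ny];
return 0;
}
q.push(state{nx, ny, cur.step + 1});
}
}
}
cout << -1;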
Let’s formalize this. Let $N = n \times m$. The number of vertices is $N$. The number of edges is at most $4N$. The algorithm is essentially Dijkstra’s.
- Pushing the initial state: $O(\log N)$.
- The while loop runs $O(N)$ times: each cell is expanded once, and duplicate entries left in the queue are popped and skipped via the visited set.
- Inside the loop: pq.pop() costs $O(\log N)$; visited.count() and visited.insert() on an unordered set are on average $O(1)$; the inner loop runs 4 times, and each pq.push() inside it costs $O(\log N)$.

Total time: $N \times (O(\log N) + O(1) + 4 \times O(\log N)) = N \times O(\log N) = O(nm \log(nm))$.
Space Complexity:
The space required is determined by the data structures used to store the grid, the visited set, and the priority queue.
- Grid: The grid itself is an $n \times m$ matrix, requiring $O(nm)$ space.
- Visited Set: In the worst case, the visited set can store a pair of coordinates for every cell in the grid. This requires $O(nm)$ space.
- Priority Queue: The priority queue can also, in the worst case, hold a state for every cell in the grid (up to a constant factor, since a cell can be pushed once per incoming edge). This also requires $O(nm)$ space.

Summing these up, the total space complexity is $O(nm) + O(nm) + O(nm) = O(nm)$.
Formally, $S(n, m) = \text{space}(\text{grid}) + \text{space}(\text{visited}) + \text{space}(\text{pq})$. Let $N=nm$. $S(n, m) = c_1 N + c_2 N + c_3 N = O(N) = O(nm)$.
Oddities
The Problem
This is a fundamental programming exercise focused on the concept of parity. We are asked to determine whether a given integer is odd or even. An integer $n$ is defined as even if it is a multiple of two (i.e., can be expressed as $n=2k$ for some integer $k$), and odd otherwise. The program needs to handle multiple test cases, reading an integer and printing whether it is odd or even.
My Code
The solution is straightforward and relies on the modulo operator (%), which gives the remainder of a division. A number is even if its remainder when divided by 2 is 0, and odd if the remainder is 1. I also handle negative numbers by taking the absolute value first.
// The number of test cases `n` is read first.
int n;
cin>>n;
// Loop `n` times to process each test case.
while(n--) {
int x;
cin>>x;
// To correctly handle negative numbers like -5, we first take the absolute value.
// Then, we use the modulo operator (%) to find the remainder when divided by 2.
// If abs(x) % 2 is 1, the number is odd.
// If abs(x) % 2 is 0, the number is even.
// The ternary operator `(condition ? value_if_true : value_if_false)`
// is used for a compact output string.
cout << x << " is " << ((abs(x) % 2 == 1) ? "odd" : "even") << endl;
}
My Solution
The problem is to classify a given integer $x$ as either “even” or “odd”. The mathematical definition of parity is central to the solution. An integer $z$ is called even if it can be written as $z=2k$ for some integer $k$. An integer is called odd if it can be written as $z=2k+1$ for some integer $k$. This definition is a direct consequence of the Division Algorithm from number theory, which states that for any integer $a$ and any positive integer $d$, there exist unique integers $q$ (quotient) and $r$ (remainder) such that $a = dq + r$ and $0 \le r < d$.
When we apply the Division Algorithm with the divisor $d=2$, we find that for any integer $x$, there exist unique integers $q$ and $r$ such that $x = 2q + r$ and $0 \le r < 2$. This means the only possible values for the remainder $r$ are 0 or 1.
- If $r=0$, then $x=2q$. By definition, $x$ is an even number.
- If $r=1$, then $x=2q+1$. By definition, $x$ is an odd number.
This provides a clear and unambiguous method for determining the parity of any integer: we just need to find its remainder when divided by 2. The modulo operator (%) in C++ and many other programming languages is designed to compute this remainder. So, x % 2 will yield the remainder of $x$ divided by 2.
A small subtlety arises with negative numbers. The behavior of the modulo operator with negative operands can differ across programming languages. In C++, the result of a % n has the same sign as a. For example, -5 % 2 results in -1. To ensure a consistent and simple check, my solution first takes the absolute value of the input integer x using abs(x). This maps both positive and negative integers to their non-negative counterparts. For any non-negative integer, abs(x) % 2 will be either 0 (for even numbers) or 1 (for odd numbers). This approach works because the parity of an integer $x$ is the same as the parity of its absolute value, $|x|$. If $x = 2k$, then $|x| = |2k| = 2|k|$, which is even. If $x = 2k+1$, then $|x|=|2k+1|$. If $k \ge 0$, $|x|=2k+1$, which is odd. If $k < 0$, say $k' = -k > 0$, then $x = -2k'+1$. $|x| = |-(2k'-1)| = 2k'-1 = 2(k'-1)+1$, which is also odd. Thus, checking abs(x) % 2 is a robust method.
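To see the pitfall concretely, here is a hypothetical two-line check (the printed values follow from C++’s truncation-toward-zero division):

cout << (-5 % 2) << '\n'; // prints -1: the remainder takes the sign of the dividend
cout << (abs(-5) % 2) << '\n'; // prints 1, the value the parity test expects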
My code implements this logic concisely. It reads an integer $x$, computes abs(x) % 2, and uses a ternary operator to select the correct string (“odd” or “even”) based on whether the result is 1 or 0.
The formal proof of correctness is grounded in the Division Algorithm as explained above.
Let $P(x)$ be the proposition “the algorithm correctly determines the parity of integer $x$”. We need to prove $\forall x \in \mathbb{Z}, P(x)$.
Case 1: $x$ is even.
By definition, $\exists k \in \mathbb{Z}$ such that $x = 2k$.
Then $|x| = |2k| = 2|k|$.
The expression abs(x) % 2 in the code computes $|x| \pmod 2$.
$|x| \pmod 2 = (2|k|) \pmod 2 = 0$.
The code checks if this value is 1. It is not. The ternary operator (0 == 1 ? "odd" : "even") evaluates to “even”. The algorithm correctly outputs that $x$ is even.
Case 2: $x$ is odd.
By definition, $\exists k \in \mathbb{Z}$ such that $x = 2k+1$.
Then $|x| = |2k+1|$.
The expression abs(x) % 2 computes $|x| \pmod 2$.
We know $|x| \pmod 2$ equals $(2k+1) \pmod 2$ if $2k+1 \ge 0$, and $(-(2k+1)) \pmod 2$ if $2k+1 < 0$.
In the first subcase, $(2k+1) \pmod 2 = 1$.
In the second subcase, $2k+1 < 0$, so $k \le -1$; let $k' = -k-1 \ge 0$. Then $-(2k+1) = -2k-1 = 2(-k-1)+1 = 2k'+1$. This is of the form $2q+1$ with $q = k' \ge 0$, so its remainder when divided by 2 is 1.
In all subcases, if $x$ is odd, $|x| \pmod 2 = 1$.
The code checks if this value is 1. It is. The ternary operator (1 == 1 ? "odd" : "even") evaluates to “odd”. The algorithm correctly outputs that $x$ is odd.
Since the algorithm is correct for both even and odd integers, it is correct for all integers.
Time and Space Complexity
Time Complexity:
The algorithm processes each of the $n$ test cases independently inside a while loop. For each test case, the work done is constant. It involves:
- Reading an integer (cin).
- Calling abs(), which is a constant-time operation.
- Performing a modulo operation (%), which is a constant-time operation for native integer types.
- Performing a comparison (==).
- Printing the result (cout).

All these steps take a small, constant amount of time, let’s call it $C$. The total time complexity is therefore the number of test cases $n$ multiplied by this constant time, which is $O(n)$.
Formally, let $T(n)$ be the total time for $n$ test cases. $T(n) = \sum_{i=1}^{n} C_i$, where $C_i$ is the constant time taken for the $i$-th test case. So, $T(n) = n \cdot C$. By the definition of Big-O notation, this is $O(n)$.
Space Complexity:
The algorithm uses only a few variables to store the number of test cases (n) and the integer for the current test case (x). The amount of memory used does not depend on the magnitude of the input numbers (within the limits of the int type) or the number of test cases. Therefore, the space complexity is constant, $O(1)$.
Formally, $S(n) = \text{space}(n) + \text{space}(x) = \text{sizeof(int)} + \text{sizeof(int)} = C'$, where $C'$ is a constant. This is $O(1)$.
Counting Chocolate
The Problem
This problem is a variation of the classic Partition Problem. We are given a collection of $n$ boxes of chocolate, each containing a certain number of pieces. We need to determine if it’s possible to divide these boxes between two people, John and Sam, such that the total number of chocolate pieces each person receives is exactly the same. All boxes must be distributed.
My Code
This problem can be solved by checking if there exists a subset of the boxes whose total number of pieces is exactly half of the grand total. If such a subset exists, we can give it to John, and the remaining boxes will automatically sum to the same amount for Sam. This is a subset sum problem. Given the small constraints, I opted for a recursive (backtracking) solution to explore all possible subsets.
// Recursive function to solve the subset sum problem.
// a: the vector of chocolate pieces in each box.
// john: current sum of pieces for John.
// sam: current sum of pieces for Sam.
// sp: the starting point (index) in the vector `a` for the current recursive call.
// target: the target sum for each person (total_sum / 2).
bool solve(const vector<int>& a, int john, int sam, int sp, int target) {
// Success condition: if either person's sum has reached the target, the other
// person's share is forced: all remaining boxes go to them, and their total
// becomes sum - target = target. This must be checked BEFORE the end-of-array
// base case, or a split completed by the very last box would be missed.
if(john == target || sam == target) {
return true;
}
// Pruning: if either person's sum has exceeded the target, this path is invalid.
if(john > target || sam > target) {
return false;
}
// Base case: every box has been assigned and neither sum hit the target.
if(sp == a.size()) {
return false;
}
// Recursive step: explore two choices for the box at index `sp`.
// Choice 1: Give the current box to John.
// Choice 2: Give the current box to Sam.
// The `||` (OR) means we return true if either choice leads to a solution.
return solve(a, john + a[sp], sam, sp + 1, target) || solve(a, john, sam + a[sp], sp + 1, target);
}
// Main logic
int n;
cin>>n;
vector<int> a = vector<int>(n);
int sum = 0;
for(int i=0; i<n; i++) {
cin>>a[i];
sum += a[i];
}
// A crucial initial check: If the total sum of chocolates is odd,
// it's impossible to divide it into two equal integer halves.
if(sum % 2 == 1) {
cout << "NO";
return 0;
}
// Call the recursive solver. We start with both sums at 0, from the first box (index 0),
// with the target being half the total sum.
if(solve(a, 0, 0, 0, sum / 2)) {
cout << "YES";
} else {
cout << "NO";
}
My Solution
The problem asks if a given set of integers (the number of chocolates in each box) can be partitioned into two subsets with equal sums. Let the set of chocolate counts be $S = \{a_1, a_2, \dots, a_n\}$. We are looking for a partition of $S$ into two disjoint subsets, $S_1$ and $S_2$, such that $S_1 \cup S_2 = S$, $S_1 \cap S_2 = \emptyset$, and $\sum_{x \in S_1} x = \sum_{y \in S_2} y$.
First, a simple but powerful observation can be made. If the two subsets have equal sums, say $K$, then the total sum of all chocolates, $\sum_{z \in S} z$, must be $K+K = 2K$. This implies that the total sum must be an even number. If the total sum is odd, it’s mathematically impossible to partition it into two equal integer sums. My algorithm begins with this check: it calculates the total sum of all pieces, and if this sum is odd, it immediately prints “NO” and terminates. This is a valid and efficient pruning of the entire problem space.
If the total sum is even, let it be $2K$. Our goal is now to find a subset $S_1 \subseteq S$ whose elements sum up to exactly $K$. If we can find such a subset, then the elements in the remaining subset $S_2 = S \setminus S_1$ will have a sum of $2K - K = K$, satisfying the condition. So, the problem is transformed into the Subset Sum Problem: given a set of integers, is there a non-empty subset whose sum is equal to a given integer $K$?
The Subset Sum Problem is a classic NP-complete problem. This means that there is no known algorithm that can solve it in polynomial time for all inputs. However, for small inputs, we can solve it using methods like dynamic programming or, as I have chosen, recursion with backtracking. My solve function embodies this backtracking approach.
The function solve(a, john, sam, sp, target) explores the possibilities. The parameters represent the current state of the decision-making process: john and sam are the current sums for the two people, sp is the index of the current box we are considering, and target is the desired sum $K$.
The logic of the recursion is as follows: For each box a[sp], we have two choices:
- Give the box to John.
- Give the box to Sam.
The function explores both of these choices through recursive calls. In the first call, solve(a, john + a[sp], sam, sp + 1, target), we add the current box’s value to John’s total and move on to consider the next box (sp + 1). In the second call, solve(a, john, sam + a[sp], sp + 1, target), we explore the alternative where the box is given to Sam. The || (logical OR) operator combines the results. If either of these branches of exploration eventually leads to a solution, the function returns true.
The recursion has several base cases and pruning conditions:
- Success: If at any point john == target (or sam == target), we have successfully found a subset that sums to the target. We can immediately return true. This true value will propagate up the call stack.
- Pruning: If john > target or sam > target, it means the current path has already overshot the target sum. Since all chocolate counts are positive, this sum will only increase. This branch cannot lead to a solution, so we can prune it by returning false.
- Failure: If sp == a.size(), it means we have considered every box. If we have not yet returned true, it means this particular distribution of boxes did not result in either person having a sum of exactly $K$. We return false.
The initial call to the function is solve(a, 0, 0, 0, sum/2). This starts the process with empty sums for both people, considering the first box, with the target being half of the total sum.
The correctness of this backtracking algorithm stems from the fact that it explores the entire search space of all possible partitions. The decision for each box (a[sp]) to go to John or Sam creates a binary decision tree. The leaves of this tree represent all $2^n$ possible ways to distribute the $n$ boxes. The algorithm systematically traverses this tree. The pruning conditions (john > target, sam > target) help to cut off branches that are guaranteed not to lead to a solution, but they do not affect correctness because they only discard invalid paths. Since the algorithm explores every valid possibility, if a solution exists, it is guaranteed to be found. If the function returns false, it means the exhaustive search completed without finding any valid partition.
Let $P(S, K)$ be the problem of finding if a subset of $S$ sums to $K$. My recursive function, let’s call it $R(i, j_s, s_s)$, where $i$ is the index of the item to consider, and $j_s, s_s$ are the current sums, computes whether a partition is possible. The correctness can be formally proven by induction on the number of items remaining, $k = n-i$.
- Base Case ($k=0$, i.e., $i=n$): No items are left. A solution exists if and only if $j_s=K$ or $s_s=K$. My function’s base cases cover this.
- Inductive Hypothesis: Assume $R(i+1, \dots)$ correctly solves the problem for the sub-array starting at $i+1$.
- Inductive Step: To solve for $R(i, j_s, s_s)$, we consider item $a_i$. A solution exists if (a) we give $a_i$ to John and a solution exists for the rest, i.e., $R(i+1, j_s+a_i, s_s)$ is true, OR (b) we give $a_i$ to Sam and a solution exists for the rest, i.e., $R(i+1, j_s, s_s+a_i)$ is true. This is precisely what the line return solve(...) || solve(...) does. By the inductive hypothesis, the recursive calls are correct. Therefore, the logic for step $i$ is also correct.
Time and Space Complexity
Time Complexity: The recursive algorithm, in its pure form, explores a binary tree of decisions. For each of the $n$ boxes, there are two choices. This leads to a total of $2^n$ possible distributions (leaves in the decision tree). The algorithm, in the worst case, might have to explore all of them. At each step of the recursion, a constant amount of work is done. Therefore, the time complexity is exponential, $O(2^n)$. The pruning I included can reduce the search space in practice, but the worst-case complexity remains exponential. Given the problem constraints ($n \le 1000$), this solution would be too slow. However, for typical competitive programming constraints where such an approach is intended to pass, $n$ is usually much smaller (e.g., $n \le 20$). There appears to be a mismatch between the provided solution’s complexity and the problem’s stated constraints. Assuming the constraints are looser in practice or the test cases are weak, this is the complexity of the code as written. For the stated constraints, a dynamic programming approach with complexity $O(n \cdot \text{sum})$ would be required.
Formally, let $T(i)$ be the time to solve the problem for the subarray from index $i$ to $n-1$. $T(i) = T(i+1) + T(i+1) + C = 2T(i+1) + C$. This recurrence relation unfolds to $T(0) = O(2^n)$.
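For completeness, here is a minimal sketch of the $O(n \cdot \text{sum})$ dynamic programming alternative mentioned above. It is the standard subset-sum table over half of the total; this is not the code I submitted, and it assumes the even-sum check from my main logic has already passed:

// dp[s] == true means some subset of the boxes sums to exactly s.
vector<bool> dp(sum / 2 + 1, false);
dp[0] = true; // The empty subset sums to 0.
for(int i = 0; i < n; i++) {
// Iterate s downwards so each box is used at most once per subset.
for(int s = sum / 2; s >= a[i]; s--) {
if(dp[s - a[i]]) {
dp[s] = true;
}
}
}
cout << (dp[sum / 2] ? "YES" : "NO");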
Space Complexity:
The space complexity is determined by the maximum depth of the recursion call stack. The recursion goes from index sp=0 to sp=n. This means the maximum depth of the call stack will be $n$. Each call on the stack stores its parameters and some local variables, which takes a constant amount of space. Therefore, the space complexity is proportional to the number of boxes, $O(n)$.
Formally, let $S(n)$ be the space required. The recursion depth is $n$. Each stack frame requires constant space $C$. So, $S(n) = n \cdot C = O(n)$. The vector a also takes $O(n)$ space. Total space is $O(n)$.
Maximize Sum of Squares of Digits
The Problem
This LeetCode problem asks us to construct a positive integer n with a specific number of digits, num, and a specific sum of those digits, sum. Among all “good” integers that satisfy these two conditions, we need to find the one that has the maximum possible “score,” where the score is defined as the sum of the squares of its digits. If multiple numbers yield the same maximum score, we should return the largest one. If no such number exists, we return an empty string.
My Code
To maximize the sum of squares of digits for a fixed sum, the digits should be as “uneven” as possible. For example, for a sum of 10, the digits 9 and 1 give a score of $9^2+1^2=82$, whereas 5 and 5 give $5^2+5^2=50$. This suggests a greedy approach: use as many 9s as possible. To make the resulting number as large as possible, these 9s should be placed in the most significant positions.
string Solution::maxSumOfSquares(int num, int sum) {
// Create an output string stream for easy string building.
ostringstream oss;
// A basic check for impossibility: the maximum possible sum for `num` digits
// is `9 * num`. If the required `sum` is greater, no solution exists.
if(9 * num < sum) {
return "";
}
// `cnt` tracks how many digits we have placed so far.
int cnt = 0;
// Greedily place as many '9's as possible.
while(sum > 9) {
oss << 9; // Add a '9' to our number.
cnt++; // Increment the digit count.
sum -= 9; // Decrease the remaining sum we need to achieve.
}
// After the loop, `sum` is between 0 and 9 (inclusive).
// This `sum` becomes the next digit.
oss << sum;
cnt++;
// If we haven't reached the required `num` digits yet,
// the remaining digits must be '0's.
while(num - cnt > 0) {
cnt++;
oss << 0;
}
return oss.str();
}
My Solution
The problem has two objectives layered on top of each other: first, maximize the sum of the squares of the digits, and second, among those with the maximal score, find the largest number. The constraints are a fixed number of digits (num) and a fixed sum of those digits (sum).
Let the digits of the number be $d_1, d_2, \dots, d_{\text{num}}$. We are given:
- $\sum_{i=1}^{\text{num}} d_i = \text{sum}$
- $0 \le d_i \le 9$ for all $i$.
We want to maximize the score $S = \sum_{i=1}^{\text{num}} d_i^2$.
The core mathematical insight here is that for a fixed sum, the sum of squares is maximized when the numbers are as far apart as possible. Consider two digits $a$ and $b$ with a sum $a+b=C$. If we change them to $a-1$ and $b+1$ (assuming $a>0, b<9$), the new sum is still $C$, but the new sum of squares is $(a-1)^2 + (b+1)^2 = a^2 - 2a + 1 + b^2 + 2b + 1 = (a^2+b^2) + 2(b-a+1)$. If $b \ge a$, this change increases the sum of squares. This “spreading out” of values increases the sum of squares. We can generalize this: to maximize $\sum d_i^2$ subject to $\sum d_i = C$, we should make some $d_i$ as large as possible.
This insight leads to a greedy strategy. To maximize the score, we should make the digits as large as possible. The largest possible digit is 9. Therefore, our strategy should be to use as many 9s as we can. For each 9 we use, we satisfy 9 of the required sum and use up one of the num available digit slots.
My algorithm implements this greedy choice. It repeatedly appends the digit ‘9’ to the result, subtracting 9 from the remaining sum each time, until the remaining sum is 9 or less. At this point, the remaining sum (which is between 0 and 9) must be used as the next digit. After placing this digit, if we still have not used up all num digit slots, the remaining slots must be filled with ‘0’s, as using any other digit would alter the sum.
Now, let’s consider the second objective: returning the largest number among those with the maximum score. A number is larger if its most significant digits (the ones on the left) are larger. My greedy approach of placing the 9s first, followed by the next largest digit, and then the 0s, naturally constructs the number in descending order of its digits. This ensures that the largest possible digits (the 9s) occupy the most significant positions, thus creating the lexicographically largest and numerically greatest number possible for that set of digits.
For example, if num = 3 and sum = 18, my algorithm would generate 9, then 9, then the remaining sum is 0. The digits are {9, 9, 0}. The number constructed is “990”. This number has the maximum possible score ($9^2+9^2+0^2 = 162$) and is the largest possible permutation of these digits.
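A quick sanity check of the function on examples like the one above (assuming the surrounding Solution class from the problem stub):

Solution sol;
cout << sol.maxSumOfSquares(3, 18) << '\n'; // "990": score 9*9 + 9*9 + 0*0 = 162
cout << sol.maxSumOfSquares(1, 10) << '\n'; // "": impossible, since 9 * 1 < 10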
There is an initial check for feasibility: if (9 * num < sum). The maximum possible sum for num digits is achieved if all digits are 9, which gives a sum of $9 \times \text{num}$. If the required sum exceeds this, it is impossible to form such a number, so the function correctly returns an empty string. The problem statement also implies a lower bound check is needed. The sum of digits must be at least 1 for a positive integer, which is covered by the constraints.
The formal proof of correctness for this greedy algorithm relies on an “exchange argument”. Let our greedy solution produce a sequence of digits $G = (g_1, g_2, \dots, g_n)$, where $n=\text{num}$, and let $O = (o_1, o_2, \dots, o_n)$ be any other valid sequence of digits for an optimal solution. We want to show that the score of $G$ is at least as high as the score of $O$, and that the number formed by $G$ is at least as large as the number formed by $O$ if the scores are equal. For the score analysis, assume the digits in both sequences are sorted in descending order: $g_1' \ge g_2' \ge \dots$ and $o_1' \ge o_2' \ge \dots$. Let $i$ be the first index where $g_i' \neq o_i'$. Since my algorithm picks the largest possible digits first, it must be that $g_i' > o_i'$. Since both sequences have the same sum, there must be some other index $j > i$ where $o_j' > g_j'$. Note that $o_i' < g_i' \le 9$ and $o_j' > g_j' \ge 0$, so the modification below keeps both values valid digits.

Now modify the optimal solution $O$ by changing the pair $(o_i', o_j')$ to $(o_i'+1, o_j'-1)$. The sum of digits remains the same, and the new sum of squares is $(o_i'+1)^2 + (o_j'-1)^2 = (o_i'^2 + o_j'^2) + 2(o_i' - o_j' + 1)$. Because the sequence is sorted in descending order and $i < j$, we have $o_i' \ge o_j'$, so $o_i' - o_j' + 1 \ge 1 > 0$: the exchange strictly increases the score while preserving the digit sum. This contradicts the optimality of $O$ unless no such index $i$ exists, i.e., unless $O$ uses exactly the same multiset of digits as $G$.

The underlying property is that $f(x)=x^2$ is convex: for a fixed sum $\sum d_i = C$, the sum $\sum d_i^2$ is maximized by pushing the values to the boundaries of their domain, which means using as many 9s and 0s as possible. My greedy choice does exactly this: it makes some digits as large as possible (9), forcing the others to be as small as possible (0s, plus one remainder digit). Any other configuration can be transformed into the greedy one by repeatedly taking 1 from a smaller digit and giving it to a larger one, and each such “robbing Peter to pay Paul” step increases the sum of squares. So the multiset of digits produced by the greedy algorithm is optimal for the score. My algorithm also arranges these digits in descending order, which by definition produces the largest number. Thus, the algorithm is correct.
Time and Space Complexity
Time Complexity:
The algorithm’s runtime is determined by the number of digits num and the sum.
- The while(sum > 9) loop runs at most sum / 9 times. Each iteration performs a constant number of operations (append, decrement, increment), so this part is $O(\text{sum})$.
- The while(num - cnt > 0) loop runs at most num times, so this part is $O(\text{num})$.
The total time complexity is therefore $O(\text{sum} + \text{num})$.
Formally, let $T(\text{num}, \text{sum})$ be the running time.
The first loop runs $k_1 = \lfloor (\text{sum}-1)/9 \rfloor$ times.
The second loop runs $k_2 = \text{num} - k_1 - 1$ times.
The total loop work is proportional to $k_1 + k_2 = \text{num} - 1$, plus one append for the remainder digit, so the string is built with at most num single-character appends. Appending one character to an ostringstream is amortized constant time, so string construction costs $O(\text{num})$ overall; only if each append were proportional to the current string length would the cost degrade to $O(\text{num}^2)$, and that is not how ostringstream behaves. Note also that whenever a valid answer exists we have $\text{sum} \le 9 \times \text{num}$, so $O(\text{sum} + \text{num})$ collapses to $O(\text{num})$; either way, $O(\text{sum} + \text{num})$ is a safe upper bound.
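Plugging the running example (num = 3, sum = 18) into these counts: $k_1 = \lfloor (18-1)/9 \rfloor = 1$ and $k_2 = 3 - 1 - 1 = 1$, and together with the single remainder digit that is $k_1 + k_2 + 1 = 3 = \text{num}$ appends in total.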
Space Complexity:
The space used is primarily for the output string being built. The final string will have num digits. The ostringstream will use space proportional to the number of digits. Therefore, the space complexity is $O(\text{num})$.
Formally, $S(\text{num}, \text{sum})$ is the space for the ostringstream. Its final size is num characters. So $S(\text{num}, \text{sum}) = O(\text{num})$.
Minimum Operations to Transform Array
The Problem
This problem was presented as minOperations, which takes two integer vectors, nums1 and nums2. Based on the code, it appears to be a custom transformation problem. The goal is to calculate a “cost” or “number of operations”. The calculation involves a base cost, which is the sum of element-wise absolute differences between nums1 and nums2, and an additional cost related to a special value extra, defined as the last element of nums2.
My Code
The code I wrote calculates a value based on a specific formula. It first computes a base step count by summing up all the abs(nums1[i] - nums2[i]) differences. Then, it determines an additional cost related to the extra value. It finds the element in either nums1 or nums2 that is “nearest” to extra. An additional operation seems to be required, and its cost depends on whether extra falls between any (nums1[i], nums2[i]) pair.
long long Solution::minOperations(vector<int> &nums1, vector<int> &nums2) {
    // The 'extra' value is defined as the last element of the second array.
    int extra = nums2.back();
    long long step_cnt = 0;
    // `nearest` will store the index of the element closest to `extra`.
    int nearest = 0;
    // `nearest_d` stores the minimum distance found so far.
    int nearest_d = INT_MAX;
    // `flag` will be true if `extra` is "covered" by any range [nums1[i], nums2[i]].
    bool flag = false;
    // `status` indicates whether the nearest element was found in nums1 (1) or nums2 (2).
    int status = 1;
    // Iterate through both arrays simultaneously.
    for(int i = 0; i < nums1.size(); i++) {
        // Find the element in either array that is closest to `extra`.
        if(abs(nums1[i] - extra) < nearest_d) {
            nearest = i;
            status = 1;
            nearest_d = abs(nums1[i] - extra);
        }
        if(abs(nums2[i] - extra) < nearest_d) {
            nearest = i;
            status = 2;
            nearest_d = abs(nums2[i] - extra);
        }
        // Check if `extra` lies within the closed interval defined by nums1[i] and nums2[i].
        if((nums1[i] <= extra && extra <= nums2[i]) || (nums2[i] <= extra && extra <= nums1[i])) {
            flag = true;
        }
        // Accumulate the base cost, which is the sum of element-wise differences.
        step_cnt += abs(nums1[i] - nums2[i]);
    }
    // Determine the final cost based on the `flag` and `status`.
    if(flag) {
        // If `extra` was covered, the cost is the base cost plus 1.
        return step_cnt + 1;
    } else if(status == 1) {
        // If not covered, and the nearest element was in nums1, add 1 and the distance.
        return step_cnt + 1 + abs(nums1[nearest] - extra);
    } else { // status == 2
        // If not covered, and the nearest element was in nums2, add 1 and the distance.
        return step_cnt + 1 + abs(nums2[nearest] - extra);
    }
}
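As a quick sanity check, here is a hypothetical driver with made-up inputs; the declaration of Solution is minimal and assumes the definition above is compiled alongside it.
#include <iostream>
#include <vector>
using namespace std;

// Minimal declaration matching the member function defined above.
struct Solution {
    long long minOperations(vector<int> &nums1, vector<int> &nums2);
};

int main() {
    Solution s;
    vector<int> nums1 = {1, 3};
    vector<int> nums2 = {2, 5};
    // extra = 5; base cost = |1-2| + |3-5| = 3; the pair (3, 5) covers extra,
    // so flag is true and the function returns 3 + 1 = 4.
    cout << s.minOperations(nums1, nums2) << '\n'; // prints 4
}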
My Solution
This problem appears to define a unique cost function for transforming nums1 into nums2 with a special consideration for an extra value. My solution meticulously implements the calculation of this cost as defined by the underlying (unseen) problem logic. Let’s break down the components of the cost and rationalize the logic.
The core of the problem seems to be about transforming nums1 to nums2. A common way to measure the “distance” or “cost of transformation” between two arrays is the sum of absolute differences, also known as the Manhattan distance or $L_1$ norm of the difference vector. The line step_cnt += abs(nums1[i] - nums2[i]); calculates this base cost. This can be interpreted as the minimum number of increment/decrement operations needed to make nums1[i] equal to nums2[i] for all i.
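To make the base cost concrete, take the made-up pair from the driver above: for nums1 = (1, 3) and nums2 = (2, 5), $S = |1-2| + |3-5| = 1 + 2 = 3$, i.e., three unit increment/decrement operations turn nums1 into nums2.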
The complexity arises from the special role of extra = nums2.back(). The final cost is not just step_cnt. There is an additional component. My algorithm’s logic suggests this is a “connection cost” to the extra value.
The code iterates through the arrays to determine two things regarding extra:
- Coverage (flag): It checks if extra is contained within any of the intervals [min(nums1[i], nums2[i]), max(nums1[i], nums2[i])]. If it is, the boolean flag is set to true. This suggests that if extra can be “formed” by a value between nums1[i] and nums2[i], the connection cost is minimal.
- Proximity (nearest, nearest_d, status): Concurrently, it finds the element across both nums1 and nums2 that is numerically closest to extra. It records the index (nearest), the minimum distance (nearest_d), and which array the element came from (status). This serves as a fallback for computing the connection cost when the coverage condition is not met.
The final return statement synthesizes these findings:
- if (flag): If extra was “covered” by at least one interval, the total cost is step_cnt + 1. This can be interpreted as the base transformation cost plus a single operation to “activate” the connection to extra, which is cheap because extra is already within range.
- else: If extra was not covered, the cost is higher: step_cnt + 1 + nearest_d. This is the base transformation cost, plus one activation operation, plus an additional cost equal to the minimum distance to extra; this extra term represents the effort needed to “pull” the nearest existing value towards extra to establish the connection.
The correctness of this solution is predicated on it being a correct implementation of the specific, custom cost function defined by the problem. Assuming the problem is to calculate this exact cost, my algorithm is correct because it systematically computes each component of the formula.
- It correctly accumulates the sum of absolute differences.
- It correctly iterates through all elements to check for the coverage condition and updates the flag appropriately.
- It correctly maintains the minimum distance to extra and the location of the element that achieves this minimum distance.
- It uses a conditional structure that correctly applies the final formula based on the computed flag and status.
Let’s formalize the cost function $C$ that the algorithm computes.
Let $S = \sum_{i=0}^{n-1} |nums1_i - nums2_i|$.
Let $E = nums2_{n-1}$.
Let the “coverage” predicate be $P = \exists i \in [0, n-1] \text{ s.t. } (E \ge \min(nums1_i, nums2_i)) \land (E \le \max(nums1_i, nums2_i))$.
Let $d_{min} = \min_{i \in [0, n-1]} (\min(|nums1_i - E|, |nums2_i - E|))$.
The algorithm computes the cost $C$ as:
$C = \begin{cases} S + 1 & \text{if } P \text{ is true} \\ S + 1 + d_{min} & \text{if } P \text{ is false} \end{cases}$
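Evaluating this on the driver’s made-up inputs, nums1 = (1, 3) and nums2 = (2, 5): $E = 5$, $P$ holds via $i = 1$ (since $3 \le 5 \le 5$), and $d_{min} = 0$, so $C = S + 1 = 3 + 1 = 4$, matching the driver’s output.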
My code is a direct and correct implementation of this function. The loop ensures that every $i$ is checked for the existence condition $P$ and that the minimum distance $d_{min}$ is found, and the final if-else block applies the two cases of the piecewise function. One subtlety is worth noting: because $E = nums2_{n-1}$, the interval test necessarily succeeds at $i = n-1$, so for nonempty, equal-length inputs $P$ is always true and the else branches serve as defensive fallbacks rather than reachable cases.
Time and Space Complexity
Time Complexity:
The algorithm is dominated by a single for loop that iterates through the arrays from i = 0 to nums1.size() - 1. Let $n$ be the size of the arrays. The loop runs $n$ times.
Inside the loop, all operations are constant time:
- Absolute value calculations.
- Comparisons.
- Assignments.
- Arithmetic operations.
Therefore, the work done inside the loop is $O(1)$, and the total time complexity is $n$ times this constant work, which is $O(n)$.
Formally, let $T(n)$ be the running time. $T(n) = \sum_{i=0}^{n-1} C$, where $C$ is the constant cost of the operations inside the loop. $T(n) = n \cdot C = O(n)$.
Space Complexity:
The algorithm uses a handful of variables (extra, step_cnt, nearest, nearest_d, flag, status) to store intermediate values. The amount of memory used is constant and does not depend on the size of the input arrays. Therefore, the space complexity is $O(1)$ (not including the space for the input arrays themselves).