feat: add UFDS flashcards and project learnings infrastructure

2026-05-04 09:01:28 +08:00
parent 69676a84be
commit 3ab8ba001d
4 changed files with 498 additions and 0 deletions
+9
@@ -31,3 +31,12 @@ code
## Self-Improvement
Periodically review this file and suggest improvements to the user if you notice gaps, inconsistencies, or missing conventions.
## Active Context
<!-- AI assistant maintains this section. Keep under 20 lines. -->
<!-- Updated automatically by /self-improve. Remove stale entries. -->
- Branch: `master`, 1 commit ahead of origin (unpushed)
- Untracked files: `org/cpp/dsa/` and `org/cpp/ufds.org` (not yet committed)
- Current work: UFDS flashcards (402-line proper card set) + DSA subdirectory
- Inbox items: binary search, `using` keyword — need cards created
- Possible cleanup: `org/cpp/dsa/udfs.org` may be a superseded draft of `org/cpp/ufds.org`
+46
@@ -0,0 +1,46 @@
# Project Learnings
> Auto-maintained by the self-improve skill. Read at session start, updated at session end.
## Patterns That Work
<!-- Approaches that produced good results -->
## Mistakes to Avoid
<!-- Failed approaches and why they failed -->
## Codebase Conventions
**[2026-05-04] — File organization**
- Observation: Most flashcard files are flat in `org/cpp/`, but subdirectories exist (`tricks/`, `dsa/`) for topic grouping. AGENTS.md only documents the flat convention.
- Action: When creating new cards, use flat `org/cpp/topic.org` for STL/language features; subdirectories for broader categories (DSA, tricks). Propose updating AGENTS.md if this solidifies.
- Confidence: medium
**[2026-05-04] — Card format variance**
- Observation: `org/cpp/dsa/udfs.org` uses raw `#+title:` + code blocks without ANKI properties. `org/cpp/ufds.org` follows the proper Anki card format. The proper format (with ANKI_NOTE_TYPE, Front/Back sections) is what gets exported.
- Action: Always use the full Anki card format from AGENTS.md when creating flashcards. Raw code files in dsa/ may be scratch/reference, not export-ready cards.
- Confidence: medium
**[2026-05-04] — Naming: UFDS not UDFS**
- Observation: `org/cpp/dsa/udfs.org` is a typo — the data structure is "Union-Find Disjoint Set" = UFDS. The properly-formatted file `org/cpp/ufds.org` uses the correct name.
- Action: Use "ufds" spelling. The dsa/udfs.org appears to be an earlier draft.
- Confidence: high
## Environment & Config
**[2026-05-04] — Git state**
- Observation: Single branch `master` with remote `origin/master`. No branching workflow — commits go directly to master.
- Action: Commit directly to master. Push when work is complete.
- Confidence: high
## Business Context
**[2026-05-04] — Study focus**
- Observation: Recent commits cover STL containers (deque, array, set, map, iterators) and DSA (UFDS). Inbox has LeetCode solutions (two sum, max consecutive ones) with notes to learn binary search and `using` keyword.
- Action: Current study trajectory is STL containers + competitive programming DSA. Prioritize cards for these topics.
- Confidence: high
## Open Questions
**[2026-05-04] — Duplicate UFDS files**
- Question: Are both `org/cpp/ufds.org` (402 lines, proper format) and `org/cpp/dsa/udfs.org` (41 lines, raw code) needed? The former seems to supersede the latter.
- Action: Ask user if `dsa/udfs.org` should be removed or merged.
+41
@@ -0,0 +1,41 @@
#+title: Udfs
* impl
#+begin_src cpp
#include <vector>
#include <numeric>
#include <algorithm>
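class Ufds {
private:
    std::vector<int> p;  // parent links
    std::vector<int> s;  // set sizes
    std::vector<int> r;  // ranks
    int numSets;         // number of disjoint sets
public:
    Ufds(int n) {
        p.resize(n);
        std::iota(p.begin(), p.end(), 0);  // p = {0, 1, ..., n-1}
        s.assign(n, 1);  // assign sets the size and fills; explain the difference from fill and iota
        r.assign(n, 0);  // std::fill on the still-empty r would write nothing
        numSets = n;
    }
    // Two destructor spellings to compare; a class can declare only one.
    // ~Ufds() {}
    ~Ufds() = default;
    // Concise find with path compression.
    int find(int x) {
        if (p[x] == x) return x;
        return p[x] = find(p[x]);
    }
    // Equivalent longer form, renamed so both versions compile side by side.
    int find_verbose(int x) {
        int px = p[x];
        if (px != x) {
            px = find_verbose(px);
        }
        p[x] = px;
        return px;
    }
};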
#+end_src
should ask what new returns? delete? what happens? why are we deleting it? heap allocated?
should show equivalent version using std::array instead of vector
should ask what resize does?
what about a true dynamic version where we create it upon calling find()
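One possible reading of the last note is a set with no fixed size, where an element is created the first time ~find()~ sees it. The sketch below assumes a ~std::unordered_map~ for the parent links. The class name ~LazyUfds~ and the ~unite()~ and ~size()~ helpers are made up for illustration and are not part of the draft above.

#+begin_src cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical lazy variant: an element becomes its own set on first find().
class LazyUfds {
    std::unordered_map<int, int> parent;
public:
    int find(int x) {
        auto it = parent.find(x);
        if (it == parent.end()) {   // first sighting: x is its own root
            parent.emplace(x, x);
            return x;
        }
        int px = it->second;        // copy before recursing
        if (px == x) return x;
        int root = find(px);        // px already exists, so nothing is inserted here
        parent[x] = root;           // path compression
        return root;
    }
    void unite(int a, int b) {
        int ra = find(a);
        int rb = find(b);
        if (ra != rb) parent[ra] = rb;
    }
    std::size_t size() const { return parent.size(); }
};
#+end_src

The trade-off is a hash lookup per step instead of a vector index, in exchange for not needing ~n~ up front.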
+402
@@ -0,0 +1,402 @@
* Ufds: Union-Find Disjoint Set :cpp:datastructure:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Write a minimal Ufds class that stores parents in a ~std::vector<int>~ and whose constructor makes every element its own parent
** Back
#+begin_src c++
#include <vector>
#include <numeric>
class Ufds {
private:
std::vector<int> parent;
public:
Ufds(int n) : parent(n) {
std::iota(parent.begin(), parent.end(), 0);
}
};
#+end_src
* Ufds find() with path compression :cpp:datastructure:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Write the ~find()~ method for Ufds with path compression
** Back
#+begin_src c++
int find(int x) {
if (parent[x] != x) {
parent[x] = find(parent[x]);
}
return parent[x];
}
#+end_src
* Concise Ufds find() :cpp:datastructure:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Is this correct? What does it do?
#+begin_src c++
int find(int x) {
if (parent[x] == x) return x;
return parent[x] = find(parent[x]);
}
#+end_src
** Back
Yes, correct and equivalent to the longer version.
- Base case: if ~x~ is its own parent, return ~x~
- Recursive case: find root of parent, then assign and return
The assignment ~parent[x] = find(parent[x])~ returns the result while compressing the path.
* Ufds constructor: initializer list vs body :cpp:cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
What is the difference between these two Ufds constructors?
#+begin_src c++
// Initializer list
Ufds(int n) : parent(n) {
std::iota(parent.begin(), parent.end(), 0);
}
// Body style
Ufds(int n) {
parent.resize(n);
std::iota(parent.begin(), parent.end(), 0);
}
#+end_src
** Back
Initializer list: constructs ~parent~ directly with ~n~ elements (one allocation).
Body style: default-constructs an empty ~parent~ first (typically no allocation), then ~resize(n)~ allocates.
Both produce the same result; the initializer list is more direct and is the idiomatic choice.
* What does std::vector::resize do? :cpp:cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
What does ~resize(n)~ do on a ~std::vector<int>~?
** Back
Resizes the vector to have ~n~ elements:
- If ~n~ > current size: appends value-initialized elements (~0~ for ~int~)
- If ~n~ < current size: truncates the vector
- Reallocates if ~n~ > capacity
#+begin_src c++
std::vector<int> v;
v.resize(5); // v = {0, 0, 0, 0, 0}
#+end_src
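The cases above can be checked directly. A minimal sketch; the function name ~resize_demo~ is made up for illustration, and the two-argument ~resize(n, value)~ overload is shown as an extra.

#+begin_src c++
#include <cassert>
#include <vector>

std::vector<int> resize_demo() {
    std::vector<int> v{1, 2, 3};
    v.resize(5);      // grow: new ints are value-initialized to 0
    assert(v.size() == 5 && v[3] == 0 && v[4] == 0);
    v.resize(2);      // shrink: elements are dropped, capacity is kept
    assert(v.size() == 2 && v.capacity() >= 5);
    v.resize(4, -1);  // grow with an explicit fill value
    return v;         // {1, 2, -1, -1}
}
#+end_src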
* Ufds destructor: when to omit :cpp:cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Should Ufds have a destructor? ~~Ufds() {}~ vs ~~Ufds() = default~
** Back
Omit it entirely if the only members are types like ~std::vector~ that clean up after themselves.
If you must write one, prefer ~= default~ over an empty ~{}~: both work, but ~= default~ states the intent explicitly for future maintenance.
* Correct Ufds constructor: is this correct? :cpp:cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Is this correct?
#+begin_src c++
Ufds(int n) {
p.resize(n);
std::iota(p.begin(), p.end(), 0);
}
#+end_src
** Back
Yes, correct. This is the body-style initialization.
Minor inefficiency: ~p~ is default-constructed empty first, then ~resize~ allocates.
Prefer the initializer list, ~Ufds(int n) : p(n)~, keeping the ~std::iota~ call in the body.
* Incorrect Ufds constructor: correct or wrong? :cpp:cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Is this correct?
#+begin_src c++
std::vector<int> p;
Ufds(int n) {
p = new std::vector<int>(n);
}
#+end_src
** Back
Wrong.
~new std::vector<int>(n)~ returns a ~vector<int>*~ (pointer), but ~p~ is a ~vector<int>~ (not a pointer).
Can't assign a pointer to a vector.
* What does new return? :cpp:cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
What does ~new Type~ return? And ~new Type[n]~?
** Back
~new Type~ returns a pointer to the new object: ~Type*~
~new Type[n]~ returns a pointer to the first element of the new array, also of type ~Type*~
#+begin_src c++
int* p1 = new int; // single int
int* p2 = new int[10]; // array of 10 ints
#+end_src
* What does delete do? :cpp:cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
What does ~delete~ do? ~delete[]~?
** Back
~delete~ frees a single object allocated with ~new~
~delete[]~ frees an array allocated with ~new[]~
#+begin_src c++
int* p1 = new int;
int* p2 = new int[10];
delete p1; // free single object
delete[] p2; // free array
#+end_src
Mismatching ~delete~ with ~new[]~ causes undefined behavior.
* Is this correct Ufds with C-style array? :cpp:cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Is this correct?
#+begin_src c++
class Ufds {
private:
int* parent;
public:
Ufds(int n) : parent(new int[n]) {}
~Ufds() { delete[] parent; }
};
#+end_src
** Back
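Correct as far as it goes. This manually manages the heap-allocated array.
- ~new int[n]~ allocates an array of ~n~ ints on the heap (values left uninitialized)
- ~delete[] parent~ in the destructor frees it
Caveat: copying a ~Ufds~ copies the raw pointer, so both copies would ~delete[]~ the same array.
This works but requires manual cleanup, which is error-prone compared to ~std::vector~.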
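To make the cost of manual cleanup concrete, here is a sketch of what a raw-array version needs before it is safe to copy (the Rule of Three). The class name ~RawUfds~, the stored size ~n~, and the ~find()~ and ~unite()~ members are illustrative additions, not part of the card above.

#+begin_src c++
#include <algorithm>
#include <cassert>
#include <numeric>

class RawUfds {
    int n;
    int* parent;
public:
    explicit RawUfds(int count) : n(count), parent(new int[count]) {
        std::iota(parent, parent + n, 0);  // every element is its own parent
    }
    // Copy constructor: deep-copy so the two objects own separate arrays.
    RawUfds(const RawUfds& other) : n(other.n), parent(new int[other.n]) {
        std::copy(other.parent, other.parent + n, parent);
    }
    // Copy assignment: allocate first, then release the old array.
    RawUfds& operator=(const RawUfds& other) {
        if (this != &other) {
            int* fresh = new int[other.n];
            std::copy(other.parent, other.parent + other.n, fresh);
            delete[] parent;
            parent = fresh;
            n = other.n;
        }
        return *this;
    }
    ~RawUfds() { delete[] parent; }
    int find(int x) {
        if (parent[x] != x) parent[x] = find(parent[x]);
        return parent[x];
    }
    void unite(int a, int b) {
        int ra = find(a);
        int rb = find(b);
        parent[ra] = rb;
    }
};
#+end_src

With ~std::vector<int>~ as the member, none of the three special functions has to be written by hand.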
* Equivalent std::array version: correct or wrong? :cpp:cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Is this correct?
#+begin_src c++
#include <array>
#include <numeric>
class Ufds {
private:
std::array<int, 10> parent;
public:
Ufds() {
std::iota(parent.begin(), parent.end(), 0);
}
};
#+end_src
** Back
Correct for compile-time fixed size.
But size ~10~ is hardcoded — not parameterized.
~std::array<T, N>~ requires ~N~ be a compile-time constant.
For runtime size, must use ~std::vector~.
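If the size really is known at compile time, it can at least be made a parameter instead of a hardcoded ~10~. A sketch, assuming a class template; the name ~FixedUfds~ and the ~find()~ and ~unite()~ members are illustrative.

#+begin_src c++
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>

// N is still a compile-time constant, but each use site picks its own value.
template <std::size_t N>
class FixedUfds {
    std::array<int, N> parent;
public:
    FixedUfds() { std::iota(parent.begin(), parent.end(), 0); }
    int find(int x) {
        if (parent[x] != x) parent[x] = find(parent[x]);
        return parent[x];
    }
    void unite(int a, int b) {
        int ra = find(a);
        int rb = find(b);
        parent[ra] = rb;
    }
};
#+end_src

Usage looks like ~FixedUfds<10> u;~. A size read from input still cannot be used as ~N~.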
* std::array vs vector vs C-style array :cpp:cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Compare ~std::array<T, N>~, ~std::vector<T>~, and ~T*~ for Ufds parent array
** Back
| Type | Size | Cleanup | Use when |
|------------------+--------------+-------------------+----------------------------|
| ~std::array<T, N>~ | compile-time | automatic | size known at compile time |
| ~std::vector<T>~ | runtime | automatic | size known at runtime |
| ~T* p = new T[n]~ | runtime | manual (~delete[]~) | legacy code only |
* Vector vs Array: is vector backed by array? :cpp:cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Is ~std::vector~ backed by an array? How does it support variable length?
** Back
Yes, ~std::vector~ allocates a contiguous memory block (like a heap array).
Typical implementation: three pointers
- ~start~: pointer to the first element
- ~finish~: pointer one past the last element
- ~end_of_storage~: pointer to the end of the allocated capacity
#+begin_src c++
std::vector<int> v(5);
v.push_back(1); // may trigger reallocation if capacity exceeded
#+end_src
When capacity is exceeded, the vector allocates a larger block, moves or copies the elements into it, and deallocates the old block.
* Vector size vs capacity :cpp:cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
What is the difference between ~size()~ and ~capacity()~ in ~std::vector~?
** Back
- ~size()~ — number of actual elements stored
- ~capacity()~ — allocated storage space
#+begin_src c++
std::vector<int> v(3); // size=3, capacity>=3 (typically 3)
v.push_back(1); // size=4, capacity>=4
v.push_back(1); // size=5, capacity>=5
#+end_src
~reserve(n)~ pre-allocates capacity without changing ~size()~.
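A small check of the size/capacity distinction. This is a sketch; the function name ~reserve_prevents_reallocation~ is made up for illustration.

#+begin_src c++
#include <cassert>
#include <vector>

bool reserve_prevents_reallocation() {
    std::vector<int> v;
    v.reserve(100);                  // capacity grows, size stays 0
    assert(v.size() == 0 && v.capacity() >= 100);
    const int* before = v.data();    // address of the reserved block
    for (int i = 0; i < 100; ++i) {
        v.push_back(i);              // fits in reserved capacity: no reallocation
    }
    return v.data() == before && v.size() == 100;
}
#+end_src

For Ufds this matters little, because the constructor already sizes the vector once and it never grows.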
* Comparison: which Ufds storage to use? :cpp:cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
When would you choose ~std::array~, ~std::vector~, or ~new T[n]~ for the Ufds parent array?
** Back
- ~std::array<T, N>~: only if ~N~ is known at compile time
- ~std::vector<T>~: if size is determined at runtime (typical Ufds case)
- ~T* p = new T[n]~: legacy code only; ~std::vector~ is safer and performs equivalently
For Ufds: ~std::vector~ is the idiomatic choice because size is runtime-determined.
* C-style array key properties :cpp:cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
What are key properties of C-style arrays (raw arrays)?
** Back
#+begin_src c++
int arr[5]; // fixed size, stack-allocated
int* p = new int[n]; // dynamic size, heap-allocated
#+end_src
- Decay to a pointer when passed to a function (size info is lost)
- No bounds checking
- ~sizeof(arr)~ gives bytes, not element count
- Must manage lifetime manually for heap arrays
C-style arrays in C++ are generally avoided in favor of ~std::array~/~std::vector~.
* std::iota for Ufds initialization :cpp:cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
How to use ~std::iota~ to initialize Ufds parent array to ~[0, 1, 2, ..., n-1]~
** Back
#+begin_src c++
#include <numeric>
std::vector<int> parent(n);
std::iota(parent.begin(), parent.end(), 0);
#+end_src
~std::iota~ fills the range with consecutive increasing values, beginning at the given start value.
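For comparison, here is how ~std::iota~ differs from ~assign~ and ~std::fill~ when setting up the usual Ufds vectors. A sketch only: the struct ~InitDemo~, its member names, and ~make_init_demo~ are illustrative.

#+begin_src c++
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

struct InitDemo {
    std::vector<int> parent;
    std::vector<int> set_size;
    std::vector<int> rank;
};

InitDemo make_init_demo(int n) {
    InitDemo d;
    d.parent.resize(n);
    std::iota(d.parent.begin(), d.parent.end(), 0);  // 0, 1, ..., n-1; needs existing elements
    d.set_size.assign(n, 1);                         // sets the size to n and fills with 1
    d.rank.resize(n);
    std::fill(d.rank.begin(), d.rank.end(), 0);      // overwrites existing elements only

    std::vector<int> empty;
    std::fill(empty.begin(), empty.end(), 7);        // empty range: writes nothing
    assert(empty.empty());
    return d;
}
#+end_src

Only ~assign~ (and ~resize~) change the size. ~std::iota~ and ~std::fill~ are algorithms over a range and never add elements.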
* Path compression in find() :cpp:datastructure:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
What does "path compression" mean in ~find()~?
** Back
After finding the root, point each visited node directly to the root:
#+begin_src c++
parent[x] = find(parent[x]); // compresses path
#+end_src
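This flattens the tree, so later ~find()~ calls are faster: amortized O(log n) with path compression alone, and amortized O(α(n)) ≈ O(1) when combined with union by rank or size.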
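The effect is easiest to see by inspecting a parent link before and after a ~find()~. The sketch below pairs path compression with union by size; the ~sz~ vector, ~unite()~, and the ~parent_of()~ accessor are assumptions added for illustration.

#+begin_src c++
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

class Ufds {
    std::vector<int> parent;
    std::vector<int> sz;  // size of the tree rooted at each root
public:
    explicit Ufds(int n) : parent(n), sz(n, 1) {
        std::iota(parent.begin(), parent.end(), 0);
    }
    int find(int x) {
        if (parent[x] != x) parent[x] = find(parent[x]);  // path compression
        return parent[x];
    }
    bool unite(int a, int b) {
        int ra = find(a);
        int rb = find(b);
        if (ra == rb) return false;             // already in the same set
        if (sz[ra] < sz[rb]) std::swap(ra, rb); // attach smaller tree under larger
        parent[rb] = ra;
        sz[ra] += sz[rb];
        return true;
    }
    int parent_of(int x) const { return parent[x]; }  // inspection only
};
#+end_src

After ~unite(0, 1)~, ~unite(2, 3)~, ~unite(1, 3)~ on four elements, node 3 still points at 2. One ~find(3)~ later it points straight at the root 0.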
* Original Ufds mistakes: is this correct? :cpp:cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Is this correct?
#+begin_src c++
class Udfs {
int parent[];
vector<int> p;
public:
Ufds(int n) {
p = new vector<int>;
}
~Ufds() {
p ? free p;
}
};
#+end_src
** Back
Wrong. Multiple errors:
1. ~int parent[]~: an array member with no size is not valid standard C++
2. ~Ufds(int n)~ constructor but class is ~Udfs~ (typo)
3. ~p = new vector<int>~: can't assign a pointer to a vector
4. ~p ? free p;~: not valid syntax, and ~p~ is not a pointer, so there is nothing to free
5. ~~Ufds()~ destructor but class is ~Udfs~ (typo)
6. ~vector~ is unqualified: it needs ~std::~ unless a ~using~ declaration is in scope
* Heap allocation: what and why? :cpp:cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
What does "heap allocated" mean? Why use it?
** Back
An object allocated on the heap with ~new~ lives until ~delete~ is called (or the program ends).
A stack-allocated object (a local variable) is freed automatically when it goes out of scope.
#+begin_src c++
int arr[10]; // stack — freed when function returns
int* p = new int[10]; // heap — lives until delete[]
#+end_src
Heap needed when: size unknown at compile time, or object must outlive scope.
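Both reasons show up in one small function. This sketch uses ~std::unique_ptr<int[]>~ so that ~delete[]~ runs automatically; the function name ~make_parents~ is made up for illustration.

#+begin_src c++
#include <cassert>
#include <memory>
#include <numeric>

// Size is only known at runtime, and the array must outlive this function.
std::unique_ptr<int[]> make_parents(int n) {
    std::unique_ptr<int[]> p(new int[n]);  // heap allocation
    std::iota(p.get(), p.get() + n, 0);    // 0, 1, ..., n-1
    return p;                              // ownership moves to the caller
}
#+end_src

The caller's ~unique_ptr~ calls ~delete[]~ when it goes out of scope, so the heap lifetime is still tied to a stack object.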