C++ internals
Critical C++ pieces
These are the parts of CFR-Edge that made the project feel like a real solver instead of a visualization demo.
01 / include/infoset.h
Information nodes
Every decision point stores cumulative regret and the accumulated average strategy. The final policy shown in the UI comes from `strategy_sum`, not the last instantaneous strategy.
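The `InfoNode` excerpt below only covers the instantaneous (regret-matching) strategy. Reading the final policy out of `strategy_sum` could look like the following minimal sketch. `average_strategy` is a hypothetical helper name, and the uniform fallback for a node with no accumulated mass is an assumption, not something the project source confirms.

```cpp
#include <cassert>

// Hypothetical helper (not from the project): normalize one node's
// accumulated strategy_sum into the average policy.
void average_strategy(const double* strategy_sum, int num_actions, double* out) {
    assert(num_actions > 0);
    double norm = 0.0;
    for (int a = 0; a < num_actions; a++) norm += strategy_sum[a];
    for (int a = 0; a < num_actions; a++) {
        // Assumption: a node that never accumulated any mass reports uniform.
        out[a] = (norm > 0.0) ? strategy_sum[a] / norm : 1.0 / num_actions;
    }
}
```

A node with `strategy_sum = {3, 1}` would report the policy `{0.75, 0.25}`, whatever its last instantaneous strategy happened to be.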
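struct InfoNode {
    double regret_sum[MAX_ACTIONS];    // cumulative counterfactual regret per action
    double strategy_sum[MAX_ACTIONS];  // weighted strategy sum, source of the final policy
    int num_actions;

    // Regret matching: play each action in proportion to its positive regret.
    void get_strategy(double* out) const {
        double norm = 0.0;
        for (int a = 0; a < num_actions; a++) {
            out[a] = std::max(regret_sum[a], 0.0);
            norm += out[a];
        }
        if (norm > 0.0) {
            for (int a = 0; a < num_actions; a++) out[a] /= norm;
        } else {
            // No positive regret yet: fall back to the uniform strategy.
            double u = 1.0 / num_actions;
            for (int a = 0; a < num_actions; a++) out[a] = u;
        }
    }
};
02 / src/kuhn.cpp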
Kuhn traversal
The traversal builds an infoset key from the acting player’s private card and the public betting history, then updates each action's regret, weighted by the opponent’s reach probability.
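The excerpt that follows uses `util[a]` and `node_util` without showing where they come from. A minimal sketch of the usual relationship, assuming `util[a]` is the value of taking action `a` and `node_util` is the strategy-weighted mean of those values. `update_regrets` is a hypothetical helper, not a function from the project.

```cpp
#include <cassert>

// Hypothetical helper: one regret update at a single infoset.
// sign is +1 when player 0 acts and -1 when player 1 acts, as in the excerpt.
double update_regrets(double* regret_sum, const double* strategy,
                      const double* util, int num_actions,
                      double cf_reach, double sign) {
    assert(num_actions > 0);
    // Expected value of the node under the current strategy (assumed definition).
    double node_util = 0.0;
    for (int a = 0; a < num_actions; a++) node_util += strategy[a] * util[a];
    // Regret of each action relative to the node value, weighted by the
    // probability that the opponent plays to this node.
    for (int a = 0; a < num_actions; a++) {
        regret_sum[a] += cf_reach * sign * (util[a] - node_util);
    }
    return node_util;
}
```

With strategy `{0.5, 0.5}`, utilities `{1, -1}`, and an opponent reach of 0.5, the node value is 0 and player 0's regrets move by +0.5 and -0.5. The same call with `sign = -1` moves them the other way.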
int player = acting_player(history);
int card = (player == 0) ? p0_card : p1_card;

// Infoset key: own private card plus the public betting history.
std::string key = std::string(1, card_name(card)) + ":" + history;
auto it = nodes.find(key);
if (it == nodes.end()) {
    it = nodes.emplace(key, InfoNode(NUM_ACTIONS)).first;
}
InfoNode& node = it->second;

double strategy[NUM_ACTIONS];
node.get_strategy(strategy);

// The counterfactual reach is the opponent's reach probability;
// sign flips the utility difference when player 1 is acting.
double cf_reach = (player == 0) ? p1_reach : p0_reach;
double sign = (player == 0) ? 1.0 : -1.0;
for (int a = 0; a < NUM_ACTIONS; a++) {
    node.regret_sum[a] += cf_reach * sign * (util[a] - node_util);
}
03 / src/kuhn.cpp
Average strategy weighting
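CFR, CFR+, and DCFR share the same traversal shape, but differ in how much each iteration contributes to the exported average strategy. Here, CFR weights every iteration equally, CFR+ weights iteration t by t, and DCFR weights it by t squared.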
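To see what the weighting does to the exported average, here is a small worked sketch. `average_weight` mirrors the three cases in the excerpt that follows; `Mode::CFR` as the name of the default case and the `weighted_average` helper are assumptions for illustration, and reach is taken to be 1 throughout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Mode { CFR, CFR_PLUS, DCFR };  // Mode::CFR is an assumed name

// Weight of iteration t (1-based): uniform, linear, or quadratic.
double average_weight(Mode mode, int iteration) {
    assert(iteration >= 1);
    if (mode == Mode::CFR_PLUS) return (double)iteration;
    if (mode == Mode::DCFR) return (double)iteration * (double)iteration;
    return 1.0;
}

// Hypothetical helper: weighted average of one action's probability over
// a sequence of iterations, with reach assumed to be 1 on every iteration.
double weighted_average(Mode mode, const std::vector<double>& prob_per_iteration) {
    double num = 0.0, den = 0.0;
    for (std::size_t i = 0; i < prob_per_iteration.size(); i++) {
        double w = average_weight(mode, (int)i + 1);
        num += w * prob_per_iteration[i];
        den += w;
    }
    return den > 0.0 ? num / den : 0.0;
}
```

If an action is played with probability 1 on iteration 1 and 0 on iteration 2, its average probability is 1/2 under uniform weights, 1/3 under linear weights, and 1/5 under quadratic weights. Later iterations dominate more strongly as the exponent grows.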
double weight;
if (mode == Mode::CFR_PLUS) weight = (double)iteration;
else if (mode == Mode::DCFR) weight = (double)iteration * (double)iteration;
else weight = 1.0;
for (int a = 0; a < NUM_ACTIONS; a++) {
    node.strategy_sum[a] += weight * my_reach * strategy[a];
}
04 / src/kuhn.cpp
Best response consistency
Exploitability needs an infoset-level best response. The best-response player has to pick one action per information set, so it cannot secretly choose different actions for hidden deals that look identical from its own seat.
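A toy comparison shows why this matters. The sketch below is not the project's `br_traverse`; it assumes a single infoset containing several hidden deals, with `value[d][a]` the payoff of action `a` in deal `d` and `prob[d]` the weight of that deal. Both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Consistent best response: one action for every deal in the infoset.
double infoset_best_response(const std::vector<std::vector<double>>& value,
                             const std::vector<double>& prob) {
    assert(!value.empty() && value.size() == prob.size());
    double best = -1e18;
    for (std::size_t a = 0; a < value[0].size(); a++) {
        double total = 0.0;
        for (std::size_t d = 0; d < value.size(); d++) total += prob[d] * value[d][a];
        best = std::max(best, total);
    }
    return best;
}

// Inconsistent version, for comparison only: picks the best action per
// deal, as if the player could see the opponent's hidden card.
double per_deal_best_response(const std::vector<std::vector<double>>& value,
                              const std::vector<double>& prob) {
    assert(value.size() == prob.size());
    double total = 0.0;
    for (std::size_t d = 0; d < value.size(); d++) {
        total += prob[d] * *std::max_element(value[d].begin(), value[d].end());
    }
    return total;
}
```

With two equally likely deals and payoffs `{1, -1}` and `{-2, 2}`, the consistent best response is worth 0.5, while the per-deal version reports 1.5. Using the second number would overstate exploitability.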
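// Group the deals by the best-response player's own card: deals in the
// same group are indistinguishable to that player at this history.
std::unordered_map<int, std::vector<DealState>> by_card;
for (const auto& s : states) {
    int card = (br_player == 0) ? s.p0_card : s.p1_card;
    by_card[card].push_back(s);
}
// One action per group, evaluated against the whole group at once.
for (auto& [card, group] : by_card) {
    double best = -1e18;
    for (int a = 0; a < NUM_ACTIONS; a++)
        best = std::max(best, br_traverse(group, history + ac[a], br_player, nodes));
    total += best;
}
05 / include/soa_store.h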
Structure-of-arrays storage
For batch operations, nodes are grouped by action count and stored action-major. That layout lets regret matching run across many nodes at once.
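A minimal sketch of what the action-major layout buys, under stated assumptions: `SoaGroup`, `add_node`, and `regret_match_all` are hypothetical names, and the project's real `Group` (shown in the excerpt that follows) may be filled and traversed differently. The point is that every inner loop walks one contiguous row of doubles.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for the project's Group: rows are actions, columns are nodes.
struct SoaGroup {
    int num_actions;
    int count = 0;
    std::vector<std::string> keys;
    std::vector<std::vector<double>> regrets;   // [action][node]
    std::vector<std::vector<double>> strategy;  // [action][node]

    explicit SoaGroup(int n) : num_actions(n), regrets(n), strategy(n) {
        assert(n > 0);
    }

    // Append one node to every action row; returns its column index.
    int add_node(const std::string& key) {
        keys.push_back(key);
        for (int a = 0; a < num_actions; a++) {
            regrets[a].push_back(0.0);
            strategy[a].push_back(0.0);
        }
        return count++;
    }

    // Regret matching for all nodes in the group, one action row at a time.
    void regret_match_all() {
        std::vector<double> norm(count, 0.0);
        for (int a = 0; a < num_actions; a++) {
            for (int i = 0; i < count; i++) {
                double p = regrets[a][i] > 0.0 ? regrets[a][i] : 0.0;
                strategy[a][i] = p;
                norm[i] += p;
            }
        }
        for (int a = 0; a < num_actions; a++) {
            for (int i = 0; i < count; i++) {
                // Uniform fallback when a node has no positive regret.
                strategy[a][i] = norm[i] > 0.0 ? strategy[a][i] / norm[i]
                                               : 1.0 / num_actions;
            }
        }
    }
};
```

Because each row is a flat array indexed by node, the inner loops are the kind a compiler can auto-vectorize or that can be replaced by hand-written SIMD.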
struct Group {
    int count = 0;
    std::vector<std::string> keys;
    std::vector<std::vector<double>> regrets;     // [action][node]
    std::vector<std::vector<double>> strat_sums;  // [action][node]
    std::vector<std::vector<double>> strategy;    // [action][node]
};
06 / include/simd_utils.h
SIMD regret matching
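The AVX2 path processes four nodes per step: each 256-bit register holds one action's regrets for four nodes as doubles. The scalar tail handles the nodes left over when the count is not a multiple of four and keeps the implementation portable.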
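For reference, here is a self-contained sketch of a two-action kernel with its scalar tail. It is not the project's code: `regret_match_2` is a hypothetical name, the vector path is compiled only when `__AVX2__` is defined, and the uniform fallback is done with a compare-and-blend instead of the clamped denominator used in the excerpt that follows.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__AVX2__)
#include <immintrin.h>
#endif

// Two-action regret matching over action-major arrays of n nodes.
void regret_match_2(const double* regret0, const double* regret1,
                    double* strat0, double* strat1, std::size_t n) {
    assert(n == 0 || (regret0 && regret1 && strat0 && strat1));
    std::size_t i = 0;
#if defined(__AVX2__)
    const __m256d zero = _mm256_setzero_pd();
    const __m256d half = _mm256_set1_pd(0.5);
    for (; i + 4 <= n; i += 4) {
        __m256d p0 = _mm256_max_pd(_mm256_loadu_pd(regret0 + i), zero);
        __m256d p1 = _mm256_max_pd(_mm256_loadu_pd(regret1 + i), zero);
        __m256d sum = _mm256_add_pd(p0, p1);
        // Mask lanes that have some positive regret.
        __m256d has_mass = _mm256_cmp_pd(sum, zero, _CMP_GT_OQ);
        // Lanes without positive regret compute 0/0 here; the blend
        // discards that result and substitutes the uniform value 0.5.
        __m256d s0 = _mm256_blendv_pd(half, _mm256_div_pd(p0, sum), has_mass);
        __m256d s1 = _mm256_blendv_pd(half, _mm256_div_pd(p1, sum), has_mass);
        _mm256_storeu_pd(strat0 + i, s0);
        _mm256_storeu_pd(strat1 + i, s1);
    }
#endif
    // Scalar tail: leftover nodes, or every node when AVX2 is not enabled.
    for (; i < n; i++) {
        double p0 = regret0[i] > 0.0 ? regret0[i] : 0.0;
        double p1 = regret1[i] > 0.0 ? regret1[i] : 0.0;
        double sum = p0 + p1;
        strat0[i] = sum > 0.0 ? p0 / sum : 0.5;
        strat1[i] = sum > 0.0 ? p1 / sum : 0.5;
    }
}
```

Calling it with six nodes exercises both paths on an AVX2 build: the first four nodes go through the vector loop and the last two through the tail, and both paths perform the same per-node arithmetic.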
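// zero, one, and tiny are broadcast constants defined outside this excerpt.
__m256d r0 = _mm256_loadu_pd(regret0 + i);
__m256d r1 = _mm256_loadu_pd(regret1 + i);
// Keep only the positive part of each regret.
__m256d p0 = _mm256_max_pd(r0, zero);
__m256d p1 = _mm256_max_pd(r1, zero);
__m256d sum = _mm256_add_pd(p0, p1);
// Clamp the denominator to avoid dividing by zero. Lanes with no positive
// regret come out as 0 here; a uniform fallback is not part of this excerpt.
__m256d inv = _mm256_div_pd(one, _mm256_max_pd(sum, tiny));
_mm256_storeu_pd(strat0 + i, _mm256_mul_pd(p0, inv));
_mm256_storeu_pd(strat1 + i, _mm256_mul_pd(p1, inv));
07 / src/holdem.cpp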
Hold’em bucket mapping
The Hold’em demo reduces the 1,326 possible starting hands to 169 canonical preflop buckets: 13 pairs, 78 suited hands, and 78 offsuit hands.
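The excerpt that follows relies on `pair`, `high`, and `low` being computed earlier. A completed sketch, assuming ranks run from 0 (deuce) to 12 (ace) and that those three values are derived from the two card ranks as shown. `preflop_bucket` and `count_distinct_buckets` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>

// 13 pairs x 6 combos + 78 suited x 4 + 78 offsuit x 12 = 1,326 hands.
static_assert(13 * 6 + 78 * 4 + 78 * 12 == 1326, "starting hand count");

// Map two card ranks (0 = deuce ... 12 = ace) and a suited flag to a bucket.
int preflop_bucket(int r0, int r1, bool suited) {
    assert(r0 >= 0 && r0 <= 12 && r1 >= 0 && r1 <= 12);
    if (r0 == r1) {
        return 12 - r0;  // AA=0, KK=1, ..., 22=12
    }
    int high = std::max(r0, r1);
    int low = std::min(r0, r1);
    // Triangular index: a high card of rank h has h possible lower kickers.
    int idx = 0;
    for (int h = 12; h > high; h--) idx += h;
    idx += (high - 1 - low);
    return suited ? 13 + idx : 91 + idx;  // suited 13..90, offsuit 91..168
}

// Enumerate every rank pair and count how many distinct buckets appear.
int count_distinct_buckets() {
    bool seen[169] = {false};
    int distinct = 0;
    for (int r0 = 0; r0 <= 12; r0++) {
        for (int r1 = 0; r1 <= 12; r1++) {
            for (int s = 0; s < 2; s++) {
                if (r0 == r1 && s == 1) continue;  // a pair cannot be suited
                int b = preflop_bucket(r0, r1, s == 1);
                assert(b >= 0 && b < 169);
                if (!seen[b]) { seen[b] = true; distinct++; }
            }
        }
    }
    return distinct;
}
```

Under these assumptions, ace-king suited lands in bucket 13, ace-king offsuit in bucket 91, and three-two offsuit in the last bucket, 168.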
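// Pairs take buckets 0-12.
if (pair) {
    return 12 - r0;  // AA=0, KK=1, ..., 22=12
}
// Triangular index over (high, low) rank pairs, highest cards first.
int idx = 0;
for (int h = 12; h > high; h--) idx += h;
idx += (high - 1 - low);
// Suited hands take buckets 13-90, offsuit hands 91-168.
return suited ? 13 + idx : 91 + idx;
08 / include/json_output.h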
Static JSON export
The frontend is intentionally static. The C++ exporter writes convergence curves, snapshots, final strategies, regrets, and metadata into JSON bundles.
root["game"] = cfg.game;
root["algorithm"] = cfg.algorithm;
root["iterations"] = cfg.total_iterations;
root["convergence"] = conv_arr;
root["strategy_snapshots"] = snaps;
root["final_strategy"] = infomap_to_json(final_nodes, cfg.action_labels);
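To give a feel for what one piece of such a bundle might contain, here is a library-free sketch that serializes a convergence curve by hand. It is illustrative only: the project's exporter builds `root` with its own JSON types, and the function name and the `iteration` / `exploitability` field names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: write (iteration, exploitability) points as a JSON array.
// Assumes finite values; NaN and infinity are not representable in JSON.
std::string convergence_to_json(const std::vector<std::pair<int, double>>& points) {
    std::ostringstream out;
    out << "[";
    for (std::size_t i = 0; i < points.size(); i++) {
        assert(points[i].first >= 0);  // iteration counts are non-negative
        if (i > 0) out << ",";
        out << "{\"iteration\":" << points[i].first
            << ",\"exploitability\":" << points[i].second << "}";
    }
    out << "]";
    return out.str();
}
```

A static frontend can then load the resulting file and plot the curve without any server-side code.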