
I'm heading to my second counseling session shortly; let me use this gap to write a little.

The first session was last Thursday, and it left me with some strange feelings. The counseling probably works: on a practical, non-emotional level, after an hour of pouring things out I finally seemed able to get things done, and I've been frantically busy ever since. On the emotional level, though, my sadness only seems to grow by the day.

By rights there should be plenty to be happy about: the national and provincial innovation project defense went very smoothly; it conveniently got me out of the Lanqiao Cup, which I didn't want to attend anyway; my research made new progress; I gave lz a fluent hour-long explanation without any pressure being put on me. Yet I feel no happiness or joy at all. When she left, it's as if she took all my positive emotions with her.

What am I like now? I keep myself buried in research all day, because the moment I slip even slightly out of that state of extreme focus, I start thinking of her. I can hardly fall asleep now; unless I burn through a whole day's energy I can barely sleep at all, and even when I do, it's one nightmare after another until morning. I can't listen to music: every song I love drags up memories. I have no interest in going out to see the flowers or the sunset, because I always end up thinking of her. I can hardly do anything besides research. I can't eat at the dining hall, because I'd see the table where we once ate together. I don't dare bike to the Sun Yat-sen Mausoleum, where I once took beautiful photos of her. I don't dare go to Jiulong Lake for late-night food, because that's when I used to call her. I don't dare go on long runs, because partway through my head fills with her. I don't dare order braised beef brisket rice, because she once ordered it for me. I can hardly do anything at all anymore.

  • sendme_pending_timestamps: the timestamp at which each still-unacknowledged data cell was sent
  • inflight: the number of cells sent but not yet acknowledged by a SENDME
  • cwnd: the maximum number of cells currently allowed in flight
  • cwnd_min: the lower bound on the congestion window, at least sendme_inc
  • cwnd_inc: the number of units cwnd may grow per update in steady state
  • cwnd_inc_rate: how many SENDMEs pass between cwnd updates
  • next_cc_event: how many more SENDME acks until a congestion check is allowed
  • next_cwnd_event: how many more SENDME acks until we judge whether cwnd is full
  • in_slow_start: whether we are in the slow-start phase
  • cwnd_full: whether the window has recently become full
  • blocked_chan: whether this circuit is currently blocked by local congestion
  • sendme_inc: how many cells each SENDME acknowledges
  • vegas_params:
    • alpha: idle-signal threshold
    • beta: moderate-congestion threshold
    • gamma: slow-start exit threshold
    • delta: heavy-congestion threshold
    • ss_cwnd_max: maximum cwnd during slow start
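For orientation, here is a minimal sketch of how this per-circuit state might be laid out. The field names mirror Tor's congestion_control_t, but the types and layout below are an illustration reconstructed from the list above, not a verbatim copy of the Tor headers.

#include <stdbool.h>
#include <stdint.h>

struct timestamp_queue; /* opaque FIFO of cell send times */

/* Illustrative sketch of the per-circuit congestion control state. */
struct cc_state_sketch {
  struct timestamp_queue *sendme_pending_timestamps; /* sends awaiting a SENDME */
  uint64_t inflight;        /* cells sent but not yet SENDME-acked */
  uint64_t cwnd;            /* max cells allowed in flight */
  uint64_t cwnd_min;        /* floor for cwnd, >= sendme_inc */
  uint64_t cwnd_inc;        /* steady-state growth per update */
  uint64_t cwnd_inc_rate;   /* SENDMEs between cwnd updates */
  uint64_t next_cc_event;   /* SENDME acks until next congestion check */
  uint64_t next_cwnd_event; /* SENDME acks until the cwnd-full check */
  bool in_slow_start;       /* still in slow start? */
  bool cwnd_full;           /* window recently filled? */
  bool blocked_chan;        /* circuit blocked by local queueing? */
  uint32_t sendme_inc;      /* cells acknowledged per SENDME */
  struct {                  /* Vegas thresholds, in cells of queue use */
    uint64_t alpha, beta, gamma, delta, ss_cwnd_max;
  } vegas_params;
};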
/* Update ack counter until next congestion signal event is allowed */
if (cc->next_cc_event)
  cc->next_cc_event--;

/* Update ack counter until a full cwnd is processed */
if (cc->next_cwnd_event)
  cc->next_cwnd_event--;

/* Compute BDP and RTT. If we did not update, don't run the alg */
if (!congestion_control_update_circuit_estimates(cc, circ)) {
  cc->inflight = cc->inflight - cc->sendme_inc;
  return 0;
}

/* The queue use is the amount in which our cwnd is above BDP;
 * if it is below, then 0 queue use. */
if (vegas_bdp(cc) > cc->cwnd)
  queue_use = 0; // This should not happen anymore..
else
  queue_use = cc->cwnd - vegas_bdp(cc);

/* Update the full state */
if (cwnd_became_full(cc))
  cc->cwnd_full = 1;
else if (cwnd_became_nonfull(cc))
  cc->cwnd_full = 0;
We have just received a SENDME, so both "remaining" counters are decremented.

Then congestion_control_update_circuit_estimates is called to compute the RTT and BDP.

If the BDP could not be updated, the adjustment algorithm is skipped: inflight is simply reduced by the amount just acknowledged, and the function returns.

If the BDP exceeds cwnd, the queue use is 0; otherwise queue_use = cwnd - BDP. The cwnd_full state is then refreshed. (The fullness test keeps a small buffer gap between inflight and cwnd.)
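The cwnd_became_full / cwnd_became_nonfull helpers are not shown in this excerpt. As a rough sketch of my reading (the name and exact gap rule here are assumptions, not the Tor implementation), the window counts as full once inflight comes within a few sendme_inc's of cwnd:

/* Hedged sketch: "full" means inflight is within full_gap * sendme_inc
 * cells of cwnd. 'full_gap' stands in for a consensus parameter; this
 * is illustrative, not the exact Tor code. */
static bool cwnd_is_full_sketch(uint64_t inflight, uint64_t cwnd,
                                uint64_t sendme_inc, uint64_t full_gap)
{
  return inflight + full_gap * sendme_inc >= cwnd;
}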

/**
 * Upon receipt of a SENDME, pop the oldest timestamp off the timestamp
 * list, and use this to update RTT.
 *
 * Returns true if circuit estimates were successfully updated, false
 * otherwise.
 */
bool
congestion_control_update_circuit_estimates(congestion_control_t *cc,
                                            const circuit_t *circ)
{
  uint64_t now_usec = monotime_absolute_usec();

  /* Update RTT first, then BDP. BDP needs fresh RTT */
  uint64_t curr_rtt_usec = congestion_control_update_circuit_rtt(cc, now_usec);
  return congestion_control_update_circuit_bdp(cc, circ, curr_rtt_usec);
}

/**
 * Called when we get a SENDME. Updates circuit RTT by pulling off a
 * timestamp of when we sent the CIRCWINDOW_INCREMENT-th cell from
 * the queue of such timestamps, and comparing that to current time.
 *
 * Also updates min, max, and EWMA of RTT.
 *
 * Returns the current circuit RTT in usecs, or 0 if it could not be
 * measured (due to clock jump, stall, etc).
 */
STATIC uint64_t
congestion_control_update_circuit_rtt(congestion_control_t *cc,
                                      uint64_t now_usec)
{
  uint64_t rtt, ewma_cnt;
  uint64_t sent_at_timestamp;

  tor_assert(cc);

  /* Get the time that we sent the cell that resulted in the other
   * end sending this sendme. Use this to calculate RTT */
  sent_at_timestamp = dequeue_timestamp(cc->sendme_pending_timestamps);

  rtt = now_usec - sent_at_timestamp;

  /* Do not update RTT at all if it looks fishy */
  if (time_delta_stalled_or_jumped(cc, cc->ewma_rtt_usec, rtt)) {
    num_clock_stalls++; /* Accounting */
    return 0;
  }

  ewma_cnt = n_ewma_count(cc);

  cc->ewma_rtt_usec = n_count_ewma(rtt, cc->ewma_rtt_usec, ewma_cnt);

  if (rtt > cc->max_rtt_usec) {
    cc->max_rtt_usec = rtt;
  }

  if (cc->min_rtt_usec == 0) {
    // If we do not have a min_rtt yet, use current ewma
    cc->min_rtt_usec = cc->ewma_rtt_usec;
  } else if (cc->cwnd == cc->cwnd_min && !cc->in_slow_start) {
    // Raise min rtt if cwnd hit cwnd_min. This gets us out of a wedge state
    // if we hit cwnd_min due to an abnormally low rtt.
    uint64_t new_rtt = percent_max_mix(cc->ewma_rtt_usec, cc->min_rtt_usec,
                                       rtt_reset_pct);

    static ratelim_t rtt_notice_limit = RATELIM_INIT(300);
    log_fn_ratelim(&rtt_notice_limit, LOG_NOTICE, LD_CIRC,
      "Resetting circ RTT from %"PRIu64" to %"PRIu64" due to low cwnd",
      cc->min_rtt_usec/1000, new_rtt/1000);

    cc->min_rtt_usec = new_rtt;
    num_rtt_reset++; /* Accounting */
  } else if (cc->ewma_rtt_usec < cc->min_rtt_usec) {
    // Using the EWMA for min instead of current RTT helps average out
    // effects from other conns
    cc->min_rtt_usec = cc->ewma_rtt_usec;
  }

  return rtt;
}

/**
 * Called when we get a SENDME. Updates the bandwidth-delay-product (BDP)
 * estimates of a circuit. Several methods of computing BDP are used,
 * depending on scenario. While some congestion control algorithms only
 * use one of these methods, we update them all because it's quick and easy.
 *
 * - now_usec is the current monotime in usecs.
 * - curr_rtt_usec is the current circuit RTT in usecs. It may be 0 if no
 *   RTT could be measured.
 *
 * Returns true if we were able to update BDP, false otherwise.
 */
static bool
congestion_control_update_circuit_bdp(congestion_control_t *cc,
                                      const circuit_t *circ,
                                      uint64_t curr_rtt_usec)
{
  int chan_q = 0;
  unsigned int blocked_on_chan = 0;

  tor_assert(cc);

  if (CIRCUIT_IS_ORIGIN(circ)) {
    /* origin circs use n_chan */
    chan_q = circ->n_chan_cells.n;
    blocked_on_chan = circ->circuit_blocked_on_n_chan;
  } else {
    /* Both onion services and exits use or_circuit and p_chan */
    chan_q = CONST_TO_OR_CIRCUIT(circ)->p_chan_cells.n;
    blocked_on_chan = circ->circuit_blocked_on_p_chan;
  }

  /* If we have no EWMA RTT, it is because monotime has been stalled
   * or messed up the entire time so far. Set our BDP estimates directly
   * to current cwnd */
  if (!cc->ewma_rtt_usec) {
    uint64_t cwnd = cc->cwnd;

    tor_assert_nonfatal(cc->cwnd <= cwnd_max);

    /* If the channel is blocked, keep subtracting off the chan_q
     * until we hit the min cwnd. */
    if (blocked_on_chan) {
      /* Cast is fine because we're less than int32 */
      if (chan_q >= (int64_t)cwnd) {
        log_notice(LD_CIRC,
                   "Clock stall with large chanq: %d %"PRIu64, chan_q, cwnd);
        cwnd = cc->cwnd_min;
      } else {
        cwnd = MAX(cwnd - chan_q, cc->cwnd_min);
      }
      cc->blocked_chan = 1;
    } else {
      cc->blocked_chan = 0;
    }

    cc->bdp = cwnd;

    static ratelim_t dec_notice_limit = RATELIM_INIT(300);
    log_fn_ratelim(&dec_notice_limit, LOG_NOTICE, LD_CIRC,
      "Our clock has been stalled for the entire lifetime of a circuit. "
      "Performance may be sub-optimal.");

    return blocked_on_chan;
  }

  /* Congestion window based BDP will respond to changes in RTT only, and is
   * relative to cwnd growth. It is useful for correcting for BDP
   * overestimation, but if BDP is higher than the current cwnd, it will
   * underestimate it.
   *
   * We multiply here first to avoid precision issues from min_RTT being
   * close to ewma RTT. Since all fields are u64, there is plenty of
   * room here to multiply first.
   */
  cc->bdp = cc->cwnd*cc->min_rtt_usec/cc->ewma_rtt_usec;

  /* The orconn is blocked; use smaller of inflight vs SENDME */
  if (blocked_on_chan) {
    log_info(LD_CIRC, "CC: Streams blocked on circ channel. Chanq: %d",
             chan_q);

    /* A blocked channel is an immediate congestion signal, but it still
     * happens only once per cwnd */
    if (!cc->blocked_chan) {
      cc->next_cc_event = 0;
      cc->blocked_chan = 1;
    }
  } else {
    /* If we were previously blocked, emit a new congestion event
     * now that we are unblocked, to re-evaluate cwnd */
    if (cc->blocked_chan) {
      cc->blocked_chan = 0;
      cc->next_cc_event = 0;
      log_info(LD_CIRC, "CC: Streams un-blocked on circ channel. Chanq: %d",
               chan_q);
    }
  }

  if (cc->next_cc_event == 0) {
    if (CIRCUIT_IS_ORIGIN(circ)) {
      log_info(LD_CIRC,
               "CC: Circuit %d "
               "SENDME RTT: %"PRIu64", %"PRIu64", %"PRIu64", %"PRIu64", "
               "BDP estimate: %"PRIu64,
               CONST_TO_ORIGIN_CIRCUIT(circ)->global_identifier,
               cc->min_rtt_usec/1000,
               curr_rtt_usec/1000,
               cc->ewma_rtt_usec/1000,
               cc->max_rtt_usec/1000,
               cc->bdp);
    } else {
      log_info(LD_CIRC,
               "CC: Circuit %"PRIu64":%d "
               "SENDME RTT: %"PRIu64", %"PRIu64", %"PRIu64", %"PRIu64", "
               "%"PRIu64,
               CONST_TO_OR_CIRCUIT(circ)->p_chan->global_identifier,
               CONST_TO_OR_CIRCUIT(circ)->p_circ_id,
               cc->min_rtt_usec/1000,
               curr_rtt_usec/1000,
               cc->ewma_rtt_usec/1000,
               cc->max_rtt_usec/1000,
               cc->bdp);
    }
  }

  /* We updated BDP this round if either we had a blocked channel, or
   * the curr_rtt_usec was not 0. */
  bool ret = (blocked_on_chan || curr_rtt_usec != 0);
  if (ret) {
    tor_trace(TR_SUBSYS(cc), TR_EV(bdp_update), circ, cc, curr_rtt_usec);
  }
  return ret;
}

The oldest send timestamp is dequeued and subtracted from the current time to obtain this SENDME's RTT, which is then sanity-checked for clock jumps and monotonic-clock stalls.

The congestion_control_update_circuit_rtt function:

Computes ewma_rtt as specified in Proposal #324.

If there is no min_rtt yet, the current EWMA becomes min_rtt. If there is one, but cwnd has fallen to cwnd_min outside of slow start, min_rtt is reset to a weighted mix of the EWMA and the old minimum (to escape a wedged state caused by an abnormally low RTT). Otherwise, if the EWMA drops below min_rtt, min_rtt is lowered to the EWMA.

Finally, the function returns this SENDME's RTT.
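The N-count EWMA itself is a simple weighted average. A minimal sketch of its shape (the weighting follows my reading of Tor's n_count_ewma; treat the exact coefficients as an assumption):

#include <stdint.h>

/* Sketch of an N-count EWMA update: the newest sample gets weight
 * roughly 2/(N+1), the running average keeps the rest. */
static uint64_t n_count_ewma_sketch(uint64_t curr, uint64_t prev, uint64_t n)
{
  if (prev == 0)  /* no history yet: seed with the first sample */
    return curr;
  return (2 * curr + (n - 1) * prev) / (n + 1);
}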

The congestion_control_update_circuit_bdp function:

Returns 1 if the BDP was successfully updated, 0 otherwise.

n_chan_cells is the number of cells queued for transmission on the n-channel.

If there is no ewma_rtt, timekeeping has been broken for the circuit's entire lifetime, so the BDP is estimated directly from the current cwnd. If the channel is currently blocked and the queued cells outnumber cwnd, a notice is logged and cwnd is set to cwnd_min; otherwise cwnd is reduced by the queue length, floored at cwnd_min, and the blocked flag stays set. The BDP is then simply estimated as cwnd.

If ewma_rtt is available, the BDP is updated according to Proposal #324.

If the channel is blocked now but was not before, next_cc_event is zeroed to force an immediate congestion check, and the blocked flag is set to 1.

If the channel is unblocked now but was blocked before, the blocked flag is cleared and a congestion check is likewise forced immediately.

If either the channel was blocked or the RTT was updated, the function returns 1 and the Vegas algorithm proceeds.
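The cwnd-based estimate is bdp = cwnd * min_rtt / ewma_rtt, multiplied first to avoid integer-precision loss. As a worked example with hypothetical numbers: with cwnd = 600 cells, min_rtt = 80 ms and ewma_rtt = 120 ms, bdp = 600 * 80000 / 120000 = 400 cells, suggesting that roughly 200 cells of the window are sitting in queues rather than on the wire.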

if (cc->in_slow_start) {
  if (queue_use < cc->vegas_params.gamma && !cc->blocked_chan) {
    /* If the congestion window is not fully in use, skip any
     * increment of cwnd in slow start */
    if (cc->cwnd_full) {
      /* Get the "Limited Slow Start" increment */
      uint64_t inc = rfc3742_ss_inc(cc);
      cc->cwnd += inc;

      // Check if inc is less than what we would do in steady-state
      // avoidance. Note that this is likely never to happen
      // in practice, but we keep this block and the metrics to make
      // sure.
      if (inc*SENDME_PER_CWND(cc) <= CWND_INC(cc)*cc->cwnd_inc_rate) {
        congestion_control_vegas_exit_slow_start(circ, cc);

        cc_stats_vegas_below_ss_inc_floor++;

        /* We exited slow start without being blocked */
        cc_stats_vegas_ss_csig_blocked_ma =
          stats_update_running_avg(cc_stats_vegas_ss_csig_blocked_ma,
                                   0);
      }
    }
  } else {
    uint64_t old_cwnd = cc->cwnd;

    /* Congestion signal: Set cwnd to gamma threshold */
    cc->cwnd = vegas_bdp(cc) + cc->vegas_params.gamma;

    /* Compute the percentage we experience a blocked csig vs RTT sig */
    if (cc->blocked_chan) {
      cc_stats_vegas_ss_csig_blocked_ma =
        stats_update_running_avg(cc_stats_vegas_ss_csig_blocked_ma,
                                 100);
    } else {
      uint64_t cwnd_diff = (old_cwnd > cc->cwnd ? old_cwnd - cc->cwnd : 0);

      cc_stats_vegas_ss_csig_blocked_ma =
        stats_update_running_avg(cc_stats_vegas_ss_csig_blocked_ma,
                                 0);

      /* Account the amount we reduced the cwnd by for the gamma cutoff */
      cc_stats_vegas_gamma_drop_ma =
        stats_update_running_avg(cc_stats_vegas_gamma_drop_ma,
                                 cwnd_diff);
    }

    congestion_control_vegas_exit_slow_start(circ, cc);
  }

  if (cc->cwnd >= cc->vegas_params.ss_cwnd_max) {
    cc->cwnd = cc->vegas_params.ss_cwnd_max;
    congestion_control_vegas_exit_slow_start(circ, cc);
    cc_stats_vegas_above_ss_cwnd_max++;
  }

  cc_stats_vegas_ss_queue_ma =
    stats_update_running_avg(cc_stats_vegas_ss_queue_ma,
                             queue_use);
}

Slow-start phase:

If queue use is below gamma and the channel is not blocked, and the window is actually being fully used, cwnd is raised by the RFC 3742 increment, followed by a safety check: if the increment is no larger than what steady state would grant anyway, slow start is exited.

Otherwise, a new cwnd is set from the gamma parameter (BDP + gamma), and cc_stats_vegas_ss_csig_blocked_ma is updated. This statistic records the cause of congestion signals: the closer to 100, the larger the share caused by a blocked channel; the closer to 0, the more they were triggered by rising RTT. The amount the window was cut by the gamma cutoff is recorded as well.

Once a congestion signal is seen, slow start is exited immediately.

Slow start also has a maximum window, ss_cwnd_max; if cwnd exceeds it, cwnd is capped there and slow start ends.
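A hypothetical worked example of the gamma cutoff: with cwnd = 600, BDP = 400 and gamma = 150 (made-up values), queue_use = 200 >= gamma, so cwnd is cut to 400 + 150 = 550; the 50-cell reduction is folded into cc_stats_vegas_gamma_drop_ma if the signal was RTT-based, and slow start ends.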

else if (cc->next_cc_event == 0) {
  if (queue_use > cc->vegas_params.delta) {
    uint64_t old_cwnd = cc->cwnd;
    uint64_t cwnd_diff;

    /* If we are above the delta threshold, drop cwnd down to the
     * delta threshold. */
    cc->cwnd = vegas_bdp(cc) + cc->vegas_params.delta - CWND_INC(cc);

    /* Account the amount we reduced the cwnd by for the gamma cutoff */
    cwnd_diff = (old_cwnd > cc->cwnd ? old_cwnd - cc->cwnd : 0);
    cc_stats_vegas_delta_drop_ma =
      stats_update_running_avg(cc_stats_vegas_delta_drop_ma,
                               cwnd_diff);

    cc_stats_vegas_above_delta++;

    /* Percentage metrics: Add 100% delta, 0 for other two */
    cc_stats_vegas_csig_alpha_ma =
      stats_update_running_avg(cc_stats_vegas_csig_alpha_ma,
                               0);
    cc_stats_vegas_csig_beta_ma =
      stats_update_running_avg(cc_stats_vegas_csig_beta_ma,
                               0);
    cc_stats_vegas_csig_delta_ma =
      stats_update_running_avg(cc_stats_vegas_csig_delta_ma,
                               100);
  } else if (queue_use > cc->vegas_params.beta || cc->blocked_chan) {
    cc->cwnd -= CWND_INC(cc);

    /* Compute the percentage we experience a blocked csig vs RTT sig */
    if (cc->blocked_chan) {
      cc_stats_vegas_csig_blocked_ma =
        stats_update_running_avg(cc_stats_vegas_csig_blocked_ma,
                                 100);
    } else {
      cc_stats_vegas_csig_blocked_ma =
        stats_update_running_avg(cc_stats_vegas_csig_blocked_ma,
                                 0);
    }

    /* Percentage counters: Add 100% beta, 0 for other two */
    cc_stats_vegas_csig_alpha_ma =
      stats_update_running_avg(cc_stats_vegas_csig_alpha_ma,
                               0);
    cc_stats_vegas_csig_beta_ma =
      stats_update_running_avg(cc_stats_vegas_csig_beta_ma,
                               100);
    cc_stats_vegas_csig_delta_ma =
      stats_update_running_avg(cc_stats_vegas_csig_delta_ma,
                               0);
  } else if (cc->cwnd_full &&
             queue_use < cc->vegas_params.alpha) {
    cc->cwnd += CWND_INC(cc);

    /* Percentage counters: Add 100% alpha, 0 for other two */
    cc_stats_vegas_csig_alpha_ma =
      stats_update_running_avg(cc_stats_vegas_csig_alpha_ma,
                               100);
    cc_stats_vegas_csig_beta_ma =
      stats_update_running_avg(cc_stats_vegas_csig_beta_ma,
                               0);
    cc_stats_vegas_csig_delta_ma =
      stats_update_running_avg(cc_stats_vegas_csig_delta_ma,
                               0);
  } else {
    /* Percentage counters: No signal this round. Add 0% to all */
    cc_stats_vegas_csig_alpha_ma =
      stats_update_running_avg(cc_stats_vegas_csig_alpha_ma,
                               0);
    cc_stats_vegas_csig_beta_ma =
      stats_update_running_avg(cc_stats_vegas_csig_beta_ma,
                               0);
    cc_stats_vegas_csig_delta_ma =
      stats_update_running_avg(cc_stats_vegas_csig_delta_ma,
                               0);
  }

  /* cwnd can never fall below 1 increment */
  cc->cwnd = MAX(cc->cwnd, cc->cwnd_min);

  congestion_control_vegas_log(circ, cc);

  /* Update metrics */
  cc_stats_vegas_queue_ma =
    stats_update_running_avg(cc_stats_vegas_queue_ma,
                             queue_use);
  cc_stats_vegas_bdp_ma =
    stats_update_running_avg(cc_stats_vegas_bdp_ma,
                             vegas_bdp(cc));

  /* Log if we're above the ss_cap */
  if (cc->cwnd >= cc->vegas_params.ss_cwnd_max) {
    log_info(LD_CIRC,
             "CC: TOR_VEGAS above ss_max in steady state for circ %d: %"PRIu64,
             circ->purpose, cc->cwnd);
  }
}

Steady-state phase: once next_cc_event reaches 0, a congestion check runs immediately:

If congestion is severe (queue use above delta), cwnd is cut back to BDP + delta - CWND_INC. The size of the reduction is recorded, and the delta signal is set.

If congestion is moderate (queue use above beta) or the channel is blocked, cwnd is reduced by one increment. Whether the channel was blocked decides how the congestion-cause statistic is updated, and then the beta signal is set.

If the path is clear (window full and queue use below alpha), cwnd grows by one increment, and the alpha signal is set.

If none of these apply, all three signals are recorded as 0.

cwnd is then floored at cwnd_min, and a log entry is written.

queue_use and the BDP feed the running-average metrics. An informational message is still logged if cwnd exceeds the slow-start maximum while in steady state.
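A hypothetical worked example of the thresholds: take alpha = 62, beta = 186, delta = 310 (made-up values, in cells), BDP = 400 and cwnd = 620. Then queue_use = 220, above beta but below delta, so cwnd shrinks by one increment (say 31 cells) to 589. Had cwnd been 750, queue_use = 350 would exceed delta, and cwnd would drop straight to 400 + 310 - 31 = 679.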

  /* Reset event counters */
  if (cc->next_cwnd_event == 0) {
    cc->next_cwnd_event = SENDME_PER_CWND(cc);
  }
  if (cc->next_cc_event == 0) {
    cc->next_cc_event = CWND_UPDATE_RATE(cc);
  }

  /* Decide if enough time has passed to reset the cwnd utilization */
  if (cwnd_full_reset(cc))
    cc->cwnd_full = 0;

  /* Update inflight with ack */
  cc->inflight = cc->inflight - cc->sendme_inc;

  return 0;

/**
 * Returns the number of sendme acks we will receive before we update cwnd.
 *
 * Congestion control literature recommends only one update of cwnd per
 * cwnd worth of acks. However, we can also tune this to be more frequent
 * by increasing the 'cc_cwnd_inc_rate' consensus parameter. This tuning
 * only applies after slow start.
 *
 * If this returns 0 due to high cwnd_inc_rate, the calling code will
 * update every sendme ack.
 */
static inline uint64_t CWND_UPDATE_RATE(const struct congestion_control_t *cc)
{
  /* We add cwnd_inc_rate*sendme_inc/2 to round to nearest integer number
   * of acks */

  if (cc->in_slow_start) {
    return 1;
  } else {
    return ((cc->cwnd + cc->cwnd_inc_rate*cc->sendme_inc/2)
            / (cc->cwnd_inc_rate*cc->sendme_inc));
  }
}

/**
 * Gives us the number of SENDMEs in a CWND, rounded.
 */
static inline uint64_t SENDME_PER_CWND(const struct congestion_control_t *cc)
{
  /* We add cwnd_inc_rate*sendme_inc/2 to round to nearest integer number
   * of acks */
  return ((cc->cwnd + cc->sendme_inc/2)/cc->sendme_inc);
}

Both event counters are re-armed, the full flag is reset if enough time has passed, and inflight is updated with the ack.
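A worked example of the rounding (hypothetical numbers): with cwnd = 124 and sendme_inc = 31, SENDME_PER_CWND = (124 + 15) / 31 = 4 acks per window; with cwnd_inc_rate = 1, CWND_UPDATE_RATE = (124 + 15) / 31 = 4 as well, i.e. one cwnd update per window's worth of SENDMEs.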

That day I wrote half a journal entry and then got interrupted by something or other.

I've tried many times to reopen that document, but I can't do it. I started this new one, meaning to write something substantive, but I can't do that either. Consider this a bookmark; I hope I can book a counseling session next week.

Analysis of congestion_control_vegas_process_sendme

Overview

This function implements the SENDME-processing logic of the Vegas congestion control algorithm. Its core goal is to gauge congestion from the relationship between cwnd and the BDP, dynamically adjusting cwnd to optimize transmission, avoid congestion, and improve bandwidth utilization.

Function goals

/**
* Process a SENDME and update the congestion window according to the
* rules specified in TOR_VEGAS of Proposal #324.
*
* Essentially, this algorithm attempts to measure queue lengths on
* the circuit by subtracting the bandwidth-delay-product estimate
* from the current congestion window.
*
* If the congestion window is larger than the bandwidth-delay-product,
* then data is assumed to be queuing. We reduce the congestion window
* in that case.
*
* If the congestion window is smaller than the bandwidth-delay-product,
* then there is spare bandwidth capacity on the circuit. We increase the
* congestion window in that case.
*
* The congestion window is updated only once every congestion window worth of
* packets, even if the signal persists. It is also updated whenever the
* upstream orcon blocks, or unblocks. This minimizes local client queues.
*/
  • Dynamically adjust the congestion window (cwnd) based on current network conditions.
  • Use the relationship between the bandwidth-delay product (BDP) and cwnd to decide whether cwnd should grow or shrink.
  • Apply different strategies in different phases (slow start, steady state, and so on).

Detailed code walkthrough

1. Entry checks

tor_assert(cc && cc->cc_alg == CC_ALG_VEGAS);
tor_assert(circ);
  • Ensure cc (the congestion control struct) and the circuit circ are non-NULL.
  • Ensure the Vegas congestion control algorithm is the one in use.

2. Updating the event counters

if (cc->next_cc_event)
  cc->next_cc_event--;

if (cc->next_cwnd_event)
  cc->next_cwnd_event--;
  • next_cc_event and next_cwnd_event control how often the congestion window may be updated.
  • Decrementing the counters brings us closer to the next point at which an event may be processed.

3. Updating the circuit estimates

if (!congestion_control_update_circuit_estimates(cc, circ)) {
  cc->inflight = cc->inflight - cc->sendme_inc;
  return 0;
}
  • congestion_control_update_circuit_estimates(cc, circ) refreshes the circuit's bandwidth and latency estimates.
  • If the update fails, inflight is reduced and the function returns.
/**
 * Upon receipt of a SENDME, pop the oldest timestamp off the timestamp
 * list, and use this to update RTT.
 *
 * Returns true if circuit estimates were successfully updated, false
 * otherwise.
 */
bool
congestion_control_update_circuit_estimates(congestion_control_t *cc,
                                            const circuit_t *circ)
{
  uint64_t now_usec = monotime_absolute_usec();

  /* Update RTT first, then BDP. BDP needs fresh RTT */
  uint64_t curr_rtt_usec = congestion_control_update_circuit_rtt(cc, now_usec);
  return congestion_control_update_circuit_bdp(cc, circ, curr_rtt_usec);
}
  • First, the current RTT is updated.
  • The refreshed RTT is then used to update the BDP.
/**
 * Called when we get a SENDME. Updates circuit RTT by pulling off a
 * timestamp of when we sent the CIRCWINDOW_INCREMENT-th cell from
 * the queue of such timestamps, and comparing that to current time.
 *
 * Also updates min, max, and EWMA of RTT.
 *
 * Returns the current circuit RTT in usecs, or 0 if it could not be
 * measured (due to clock jump, stall, etc).
 */
STATIC uint64_t
congestion_control_update_circuit_rtt(congestion_control_t *cc,
                                      uint64_t now_usec)
{
  uint64_t rtt, ewma_cnt;
  uint64_t sent_at_timestamp;

  tor_assert(cc);

  /* Get the time that we sent the cell that resulted in the other
   * end sending this sendme. Use this to calculate RTT */
  sent_at_timestamp = dequeue_timestamp(cc->sendme_pending_timestamps);

  rtt = now_usec - sent_at_timestamp;

  /* Do not update RTT at all if it looks fishy */
  if (time_delta_stalled_or_jumped(cc, cc->ewma_rtt_usec, rtt)) {
    num_clock_stalls++; /* Accounting */
    return 0;
  }

  ewma_cnt = n_ewma_count(cc);

  cc->ewma_rtt_usec = n_count_ewma(rtt, cc->ewma_rtt_usec, ewma_cnt);

  if (rtt > cc->max_rtt_usec) {
    cc->max_rtt_usec = rtt;
  }

  if (cc->min_rtt_usec == 0) {
    // If we do not have a min_rtt yet, use current ewma
    cc->min_rtt_usec = cc->ewma_rtt_usec;
  } else if (cc->cwnd == cc->cwnd_min && !cc->in_slow_start) {
    // Raise min rtt if cwnd hit cwnd_min. This gets us out of a wedge state
    // if we hit cwnd_min due to an abnormally low rtt.
    uint64_t new_rtt = percent_max_mix(cc->ewma_rtt_usec, cc->min_rtt_usec,
                                       rtt_reset_pct);

    static ratelim_t rtt_notice_limit = RATELIM_INIT(300);
    log_fn_ratelim(&rtt_notice_limit, LOG_NOTICE, LD_CIRC,
      "Resetting circ RTT from %"PRIu64" to %"PRIu64" due to low cwnd",
      cc->min_rtt_usec/1000, new_rtt/1000);

    cc->min_rtt_usec = new_rtt;
    num_rtt_reset++; /* Accounting */
  } else if (cc->ewma_rtt_usec < cc->min_rtt_usec) {
    // Using the EWMA for min instead of current RTT helps average out
    // effects from other conns
    cc->min_rtt_usec = cc->ewma_rtt_usec;
  }

  return rtt;
}
/**
 * Called when we get a SENDME. Updates the bandwidth-delay-product (BDP)
 * estimates of a circuit. Several methods of computing BDP are used,
 * depending on scenario. While some congestion control algorithms only
 * use one of these methods, we update them all because it's quick and easy.
 *
 * - now_usec is the current monotime in usecs.
 * - curr_rtt_usec is the current circuit RTT in usecs. It may be 0 if no
 *   RTT could be measured.
 *
 * Returns true if we were able to update BDP, false otherwise.
 */
static bool
congestion_control_update_circuit_bdp(congestion_control_t *cc,
                                      const circuit_t *circ,
                                      uint64_t curr_rtt_usec)
{
  int chan_q = 0;
  unsigned int blocked_on_chan = 0;

  tor_assert(cc);

  if (CIRCUIT_IS_ORIGIN(circ)) {
    /* origin circs use n_chan */
    chan_q = circ->n_chan_cells.n;
    blocked_on_chan = circ->circuit_blocked_on_n_chan;
  } else {
    /* Both onion services and exits use or_circuit and p_chan */
    chan_q = CONST_TO_OR_CIRCUIT(circ)->p_chan_cells.n;
    blocked_on_chan = circ->circuit_blocked_on_p_chan;
  }

  /* If we have no EWMA RTT, it is because monotime has been stalled
   * or messed up the entire time so far. Set our BDP estimates directly
   * to current cwnd */
  if (!cc->ewma_rtt_usec) {
    uint64_t cwnd = cc->cwnd;

    tor_assert_nonfatal(cc->cwnd <= cwnd_max);

    /* If the channel is blocked, keep subtracting off the chan_q
     * until we hit the min cwnd. */
    if (blocked_on_chan) {
      /* Cast is fine because we're less than int32 */
      if (chan_q >= (int64_t)cwnd) {
        log_notice(LD_CIRC,
                   "Clock stall with large chanq: %d %"PRIu64, chan_q, cwnd);
        cwnd = cc->cwnd_min;
      } else {
        cwnd = MAX(cwnd - chan_q, cc->cwnd_min);
      }
      cc->blocked_chan = 1;
    } else {
      cc->blocked_chan = 0;
    }

    cc->bdp = cwnd;

    static ratelim_t dec_notice_limit = RATELIM_INIT(300);
    log_fn_ratelim(&dec_notice_limit, LOG_NOTICE, LD_CIRC,
      "Our clock has been stalled for the entire lifetime of a circuit. "
      "Performance may be sub-optimal.");

    return blocked_on_chan;
  }

4. Computing the queue length

if (vegas_bdp(cc) > cc->cwnd)
  queue_use = 0; // This should not happen anymore..
else
  queue_use = cc->cwnd - vegas_bdp(cc);
  • vegas_bdp(cc) computes the bandwidth-delay product (BDP), the ideal congestion window size.
  • queue_use is the amount by which the current cwnd exceeds the BDP, i.e. the data presumed to be queued in the network. If cwnd is below the BDP, nothing is queued and queue_use is 0.

5. The slow-start phase

if (cc->in_slow_start) {
  • Slow-start phase: cwnd grows step by step, governed by how much queueing is observed.

a. Deciding whether to grow the window

if (queue_use < cc->vegas_params.gamma && !cc->blocked_chan) {
  if (cc->cwnd_full) {
    uint64_t inc = rfc3742_ss_inc(cc);
    cc->cwnd += inc;
  • If queue_use is below gamma and the channel is not blocked, cwnd keeps growing.
  • rfc3742_ss_inc(cc) supplies the slow-start increment.
/**
 * Implements RFC3742: Limited Slow Start.
 * https://datatracker.ietf.org/doc/html/rfc3742#section-2
 */
static inline uint64_t
rfc3742_ss_inc(const congestion_control_t *cc)
{
  if (cc->cwnd <= cc->vegas_params.ss_cwnd_cap) {
    /* If less than the cap, round and always grow by at least 1 sendme_inc. */
    return ((uint64_t)cc->cwnd_inc_pct_ss*cc->sendme_inc + 50)/100;
  } else {
    // K = int(cwnd/(0.5 max_ssthresh));
    //  => K = 2*cwnd/max_ssthresh
    // cwnd += int(MSS/K);
    //  => cwnd += MSS*max_ssthresh/(2*cwnd)
    // Return at least 1 for inc.
    return MAX(
      ((uint64_t)cc->sendme_inc*cc->vegas_params.ss_cwnd_cap + cc->cwnd)/
      (2*cc->cwnd),
      1);
  }
}
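A worked example of the two branches, with hypothetical parameters (sendme_inc = 31, ss_cwnd_cap = 500, cwnd_inc_pct_ss = 100): below the cap the increment is (100 * 31 + 50) / 100 = 31 cells, a full sendme_inc per SENDME; above the cap, at cwnd = 1000, it falls to (31 * 500 + 1000) / 2000 = 8 cells, so growth slows as the window gets large.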

b. Exiting slow start when cwnd hits its cap

if (cc->cwnd >= cc->vegas_params.ss_cwnd_max) {
  cc->cwnd = cc->vegas_params.ss_cwnd_max;
  congestion_control_vegas_exit_slow_start(circ, cc);
}
  • If cwnd reaches the slow-start maximum (ss_cwnd_max), it is capped there and slow start is exited.

c. Pulling cwnd back on a congestion signal

else {
  uint64_t old_cwnd = cc->cwnd;
  cc->cwnd = vegas_bdp(cc) + cc->vegas_params.gamma;
}
  • If queue_use exceeds gamma, congestion has appeared, and cwnd is set to BDP + gamma to relieve it.

6. The steady-state phase (congestion avoidance)

} else if (cc->next_cc_event == 0) {
  • Steady state: when next_cc_event == 0, cwnd may be updated; this is the normal congestion-avoidance path.

a. Queue longer than delta: shrink cwnd

if (queue_use > cc->vegas_params.delta) {
  cc->cwnd = vegas_bdp(cc) + cc->vegas_params.delta - CWND_INC(cc);
}
  • If queue_use exceeds delta, congestion is severe, and cwnd is cut back to BDP + delta - CWND_INC(cc).
#define CWND_INC(cc)           ((cc)->cwnd_inc)
struct vegas_params_t {
  /** The slow-start cwnd cap for RFC3742 */
  uint32_t ss_cwnd_cap;
  /** The maximum slow-start cwnd */
  uint32_t ss_cwnd_max;
  /** The queue use allowed before we exit slow start */
  uint16_t gamma;
  /** The queue use below which we increment cwnd */
  uint16_t alpha;
  /** The queue use above which we decrement cwnd */
  uint16_t beta;
  /** The queue use at which we cap cwnd in steady state */
  uint16_t delta;
  /** Weighted average (percent) between cwnd estimator and
   * piecewise estimator. */
  uint8_t bdp_mix_pct;
};

b. Queue longer than beta, or channel blocked: decrement cwnd

else if (queue_use > cc->vegas_params.beta || cc->blocked_chan) {
  cc->cwnd -= CWND_INC(cc);
}
  • If queue_use exceeds beta, or the channel is blocked (blocked_chan is set), cwnd is decremented to ease the congestion.

c. Queue shorter than alpha with cwnd full: increment cwnd

else if (cc->cwnd_full && queue_use < cc->vegas_params.alpha) {
  cc->cwnd += CWND_INC(cc);
}
  • If the window is full and queue_use is below alpha, cwnd is incremented to exploit the unused bandwidth.

7. Remaining bookkeeping

a. Keeping cwnd above its floor

cc->cwnd = MAX(cc->cwnd, cc->cwnd_min);
  • Ensure cwnd never falls below cwnd_min, so the window cannot collapse.

b. Logging

congestion_control_vegas_log(circ, cc);
  • Log the cwnd change for debugging and analysis.

c. Resetting the event counters

if (cc->next_cwnd_event == 0) {
  cc->next_cwnd_event = SENDME_PER_CWND(cc);
}
if (cc->next_cc_event == 0) {
  cc->next_cc_event = CWND_UPDATE_RATE(cc);
}
  • The event counters are re-armed so that control runs again at the next update point.

d. Resetting the cwnd_full state

if (cwnd_full_reset(cc))
  cc->cwnd_full = 0;
  • If the cwnd_full reset timer has expired, the cwnd_full flag is cleared.

e. Updating the in-flight count

cc->inflight = cc->inflight - cc->sendme_inc;
  • Update inflight, the number of cells currently in transit.

Summary: how cwnd evolves

1. Slow start

  • Grow cwnd: while the queue is shorter than gamma and the channel is unblocked, cwnd grows.
  • Exit slow start: once cwnd reaches its maximum (ss_cwnd_max), slow start ends.
  • Shrink cwnd: once the queue exceeds gamma, cwnd is pulled back.

2. Steady state

  • Shrink cwnd: when the queue exceeds delta, cwnd drops to BDP + delta - CWND_INC(cc).
  • Decrement cwnd: when the queue exceeds beta or the channel is blocked, cwnd shrinks by one increment.
  • Increment cwnd: when the queue is below alpha and the window is full, cwnd grows by one increment.

3. Floor protection

  • cwnd never drops below cwnd_min, preventing a window so small that the circuit destabilizes. (A condensed sketch of all these rules follows.)
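Condensing the rules above into a single sketch (illustrative only; the statistics, logging, and RFC 3742 increment are simplified, and all names here are hypothetical):

#include <stdbool.h>
#include <stdint.h>

struct vegas_params_sketch { uint64_t alpha, beta, gamma, delta, ss_cwnd_max; };

struct cc_sketch {
  uint64_t cwnd, cwnd_min, cwnd_inc, sendme_inc;
  bool in_slow_start, cwnd_full, blocked_chan;
};

/* One Vegas update, as described in the summary above. */
static void vegas_update_sketch(uint64_t bdp, struct cc_sketch *cc,
                                const struct vegas_params_sketch *p)
{
  uint64_t queue_use = cc->cwnd > bdp ? cc->cwnd - bdp : 0;

  if (cc->in_slow_start) {
    if (queue_use < p->gamma && !cc->blocked_chan) {
      if (cc->cwnd_full)
        cc->cwnd += cc->sendme_inc;      /* stand-in for rfc3742_ss_inc() */
    } else {
      cc->cwnd = bdp + p->gamma;         /* gamma cutoff */
      cc->in_slow_start = false;
    }
    if (cc->cwnd >= p->ss_cwnd_max) {    /* hard slow-start cap */
      cc->cwnd = p->ss_cwnd_max;
      cc->in_slow_start = false;
    }
  } else {
    if (queue_use > p->delta)            /* severe congestion */
      cc->cwnd = bdp + p->delta - cc->cwnd_inc;
    else if (queue_use > p->beta || cc->blocked_chan)
      cc->cwnd -= cc->cwnd_inc;          /* moderate congestion */
    else if (cc->cwnd_full && queue_use < p->alpha)
      cc->cwnd += cc->cwnd_inc;          /* spare capacity */
  }

  if (cc->cwnd < cc->cwnd_min)           /* floor protection */
    cc->cwnd = cc->cwnd_min;
}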

Another stretch with hardly any time, so again it's been a while since I wrote. I actually like this rhythm, though: a sustained urge to write, so that every time I open the document there's something to say.

It's been about half a month since I last wrote, and it feels like a great deal has happened: I took on a tutoring job, the long-awaited due diligence finally came, then the midterm report, then I managed not to miss Nanjing's gorgeous bloom season, and finally the first half-marathon I'd been working toward for a long time.

The tutoring job is far away, across from NJUST in an old residential compound, but the metro gets there quickly; in practice it's about an hour door to door. Far, yes, but it also forces me into the city, which makes life a bit richer, and most importantly the pay is generous. The kid is bright, if not exactly obedient, but that hardly matters; bright is already good enough.

The due diligence did not go smoothly. The reviewer's grilling made me very uncomfortable; I don't see why problems inherent to the project itself should be laid at my feet at this stage. But so be it: worst case the project ends here and I take it to the innovation competition instead. A number of private investors did offer to invest directly in my company; I couldn't see any real benefit in it for me, so I turned them all down, though I'm grateful to everyone willing to trust me.

Writing the midterm report was genuinely dramatic, and dizzying, mostly because there was far too much to write and I really didn't want to write it. Then my wretched teammate pulled a disgusting move on me, which left me completely numb. Still, we scraped together a 19-page PDF in the end.

Tor Entry-Node Conflux Mechanism: An Analysis

1. Overview

Tor's Conflux mechanism lets a client split traffic across multiple circuits to improve censorship resistance and network performance. This report analyzes how the entry side decides whether to enable Conflux when sending data, and how, under Conflux, it chooses the circuit to transmit on.

2. How the entry node decides whether to enable Conflux

When a Tor entry node sends data, circuit_establish_circuit_conflux is called to build a Conflux circuit. Its main flow is as follows:

  1. Verify the Conflux purpose

    tor_assert(purpose == CIRCUIT_PURPOSE_CONFLUX_UNLINKED);

    The entry node only uses Conflux when the purpose is CIRCUIT_PURPOSE_CONFLUX_UNLINKED.

  2. Initialize the circuit

    circ = origin_circuit_init(purpose, flags);
    TO_CIRCUIT(circ)->conflux_pending_nonce =
      tor_memdup(conflux_nonce, DIGEST256_LEN);

    This initializes the Conflux circuit and stores the conflux_nonce.

  3. Pick an exit node

    if (onion_pick_cpath_exit(circ, exit_ei, 0) < 0 ||
        onion_populate_cpath(circ) < 0) {
      circuit_mark_for_close(TO_CIRCUIT(circ), END_CIRC_REASON_NOPATH);
      return NULL;
    }

    If no suitable exit node can be chosen, the circuit is closed on the spot.

  4. Establish the connection

    if ((err_reason = circuit_handle_first_hop(circ)) < 0) {
      circuit_mark_for_close(TO_CIRCUIT(circ), -err_reason);
      return NULL;
    }

    This attempts to connect to the first-hop OR (Onion Router); on failure, the circuit is closed.

3. How the entry node decides whether to switch circuits

When the entry node sends data via Conflux, conflux_decide_circ_for_send picks the appropriate circuit. Its logic is as follows:

3.1 Decide whether to multiplex

if (!conflux_should_multiplex(relay_command)) {
  return orig_circ;
}

If the current relay_command does not support multiplexing, the original circuit is used directly.

3.2 Pick the next circuit

circuit_t *new_circ = conflux_decide_next_circ(cfx);

conflux_decide_next_circ is called to choose the next circuit.

3.3 Handle non-data commands

if (!new_circ && relay_command != RELAY_COMMAND_DATA) {
  if (!cfx->curr_leg) {
    log_warn(LD_BUG, "No current leg for conflux with relay command %d",
             relay_command);
    return NULL;
  }
  return cfx->curr_leg->circ;
}

If no circuit is available but the command is not RELAY_COMMAND_DATA, the current circuit continues to be used.

3.4 Send the switch command

if (new_circ) {
  conflux_leg_t *new_leg = conflux_get_leg(cfx, new_circ);
  tor_assert(cfx->curr_leg);
  if (new_circ != cfx->curr_leg->circ) {
    uint64_t relative_seq = cfx->prev_leg->last_seq_sent -
                            cfx->curr_leg->last_seq_sent;
    conflux_send_switch_command(cfx->curr_leg->circ, relative_seq);
    cfx->curr_leg->last_seq_sent = cfx->prev_leg->last_seq_sent;
  }
}

If a switch is decided on, relative_seq is computed and a SWITCH command is sent.
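For example (hypothetical numbers): if the previous leg has last_seq_sent = 120 and the leg being switched onto has last_seq_sent = 100, the SWITCH carries relative_seq = 20, letting the receiving end realign the two legs' sequence spaces before data continues on the new leg.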

4. The next-circuit selection logic (conflux_decide_next_circ)

conflux_decide_circ_for_send 中调用 conflux_decide_next_circ 选择下一个电路,该函数的逻辑如下:

  1. Check whether Conflux is being torn down

    if (cfx->in_full_teardown) {
      return NULL;
    }

    If the Conflux set is in full teardown, no new circuit is chosen.

  2. Initialize the current circuit

    if (!cfx->curr_leg) {
      if (!conflux_pick_first_leg(cfx))
        return NULL;
    }

    If there is no current circuit, try to pick the first leg.

  3. Check whether switching is allowed

    if (!conflux_can_switch(cfx)) {
      return cfx->curr_leg->circ;
    }

    If switching is not allowed, the current circuit is returned.

  4. Pick a circuit according to the configured strategy (a minimal selection sketch follows this list)

    switch (cfx->params.alg) {
      case CONFLUX_ALG_MINRTT:
        return (circuit_t*)conflux_decide_circ_minrtt(cfx);
      case CONFLUX_ALG_LOWRTT:
        return (circuit_t*)conflux_decide_circ_lowrtt(cfx);
      case CONFLUX_ALG_CWNDRTT:
        return (circuit_t*)conflux_decide_circ_cwndrtt(cfx);
      default:
        return NULL;
    }

    Specifically:

    • CONFLUX_ALG_MINRTT picks the circuit with the lowest RTT.
    • CONFLUX_ALG_LOWRTT favors high throughput at the cost of possible reordering.
    • CONFLUX_ALG_CWNDRTT also weighs the congestion window, reducing reordering problems.
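As a rough illustration of the MinRTT flavor (not the Tor implementation; the leg fields and names here are hypothetical), the selection amounts to scanning for the lowest-RTT leg that still has window room:

#include <stddef.h>
#include <stdint.h>

/* Hypothetical leg descriptor; field names are illustrative. */
struct leg_sketch {
  uint64_t rtt_usec;  /* latest RTT estimate for this leg */
  uint64_t inflight;  /* cells in flight on this leg */
  uint64_t cwnd;      /* congestion window of this leg */
};

/* Pick the lowest-RTT leg whose congestion window still has room.
 * Returns NULL if every leg is currently unable to send. */
static struct leg_sketch *
pick_minrtt_leg(struct leg_sketch *legs, size_t n_legs)
{
  struct leg_sketch *best = NULL;
  for (size_t i = 0; i < n_legs; i++) {
    if (legs[i].inflight >= legs[i].cwnd)
      continue; /* no window room: this leg cannot send right now */
    if (!best || legs[i].rtt_usec < best->rtt_usec)
      best = &legs[i];
  }
  return best;
}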

5. Conclusion

When sending data, a Tor entry node decides whether to enable Conflux based on the circuit purpose and current network conditions. With Conflux enabled, it chooses the best circuit by RTT, throughput, or congestion window, and sends a SWITCH command at the appropriate moment to change legs. Through these strategies, Tor improves both anonymity and transfer efficiency.

I was actually quite sleepy after dinner, having done a hard run in the afternoon too, but at some point I felt I shouldn't just lie around, so I went out for a walk.
