GRE Words (AC Automaton + Fail Tree + Segment-Tree-Optimized DP)

Description

Link: 2011 Asia ChengDu Regional Contest - Problem G

Recently George is preparing for the Graduate Record Examinations (GRE for short). Obviously the most important thing is reciting the words.
Now George is working on a word list containing N words.
He has so poor a memory that it is too hard for him to remember all of the words on the list. But he does find a way to help him to remember. He finds that if a sequence of words has a property that for all pairs of neighboring words, the previous one is a substring of the next one, then the sequence of words is easy to remember.
So he decides to eliminate some words from the word list first to make the list easier for him. Meanwhile, he doesn’t want to miss the important words. He gives each word an importance, which is represented by an integer ranging from -1000 to 1000, then he wants to know which words to eliminate to maximize the sum of the importance of the remaining words. A negative importance just means that George considers the word useless and a waste of time to recite.
Note that although he can eliminate any number of words from the word list, he can never change the order between words. In other words, the order of the words remaining on the list is consistent with their order in the input. In addition, a word may have different meanings, so it can appear on the list more than once, and it may have a different importance in each occurrence.

Input

The first line contains an integer $T$ ($1 \le T \le 50$), indicating the number of test cases.
Each test case contains several lines.
The first line contains an integer $N$ ($1 \le N \le 2 \times 10^4$), indicating the number of words.
Then $N$ lines follow, each containing a string $S_i$ and an integer $W_i$, representing the word and its importance. $S_i$ contains only lowercase letters.
You can assume that the total length of all words will not exceed $3 \times 10^5$.

Output

For each test case in the input, print one line: “Case #X: Y”, where X is the test case number (starting with 1) and Y is the largest importance of the remaining sequence of words.

Sample Input

1
5
a 1
ab 2
abb 3
baba 5
abbab 8

Sample Output

Case #1: 14

Approach

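  • By analogy with the longest (maximum-weight) increasing subsequence, the DP is easy to write down. Let $dp[i]$ be the maximum total importance of a valid sequence ending with the $i$-th word: $$dp[i] = \max(dp[j]) + w[i]$$ where $j < i$ and $s_j$ is a substring of $s_i$. If no such $j$ exists, the maximum is taken as $0$, i.e. the sequence starts at word $i$.
  • An AC automaton can maintain these DP values, so the transition for one word only needs a single pass over its characters.
  • How do we record the DP value of the current word on the AC automaton? $s_j$ is a substring of $s_i$ exactly when $s_j$ is a suffix of some prefix of $s_i$, that is, when some state visited while reading $s_i$ reaches the end node of $s_j$ by following Fail pointers. So every node whose Fail chain passes through the current word's end node must be updated. In other words, in the tree obtained by reversing the Fail pointers, the whole subtree of that node must be updated.
  • So we compute the DFS order of the Fail tree, which turns every subtree into a contiguous interval, and use a segment tree for the range updates (each state visited while reading a word is then a point query).
  • I carelessly declared arrays of size $10^6$ and got MLE (Memory Limit Exceeded).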
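For checking the optimized solution on small inputs, the DP recurrence above can be evaluated directly. The following is only a minimal sketch: the function name `bruteForce` is hypothetical and not part of the original solution, and the substring test simply uses `std::string::find`, so it runs in roughly quadratic time in the number of words and is far too slow for the real limits. Like the original code, it starts the answer at 0, i.e. it assumes that keeping no word at all is allowed.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical reference implementation (not from the original post).
// dp[i] = best total importance of a valid sequence ending with word i.
int bruteForce(const std::vector<std::pair<std::string, int>>& words)
{
    const int n = static_cast<int>(words.size());
    std::vector<int> dp(n, 0);
    int ans = 0; // assumption: an empty selection (value 0) is allowed
    for (int i = 0; i < n; i++)
    {
        int best = 0; // 0 means "word i starts the sequence"
        for (int j = 0; j < i; j++)
        {
            // s_j must occur somewhere inside s_i
            if (words[i].first.find(words[j].first) != std::string::npos)
                best = std::max(best, dp[j]);
        }
        dp[i] = best + words[i].second;
        ans = std::max(ans, dp[i]);
    }
    return ans;
}
```

On the sample, the best choice keeps a, ab, abb and abbab, for a total of 1 + 2 + 3 + 8 = 14; baba cannot be followed by abbab because it is not a substring of it.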

Code

#include <bits/stdc++.h>
using namespace std;

// 2^18 = 262144 trie nodes. Note that this is below the 3 * 10^5 bound on the
// total length given in the statement, so worst-case input could overflow it.
constexpr int N = 1 << 18;
struct Trie
{
    int ch[N][26], f[N]; // ch: trie transitions, f: Fail pointers
    int sz, rt;
    int newnode()
    {
        memset(ch[sz], -1, sizeof(ch[sz]));
        val[sz] = 0; // redundant: the segment tree is reset in build(l, r, o)
        return sz++;
    }
    void init() { sz = 0, rt = newnode(); }
    inline int idx(const char& c) { return c - 'a'; }
    void insert(const char* s)
    {
        int u = rt;
        for (int i = 0; s[i]; i++)
        {
            int c = idx(s[i]);
            if (!~ch[u][c]) ch[u][c] = newnode();
            u = ch[u][c];
        }
    }
    vector<int> G[N];        // Fail tree: edge f[i] -> i
    int in[N], out[N], dfn;  // DFS order: subtree of u is [in[u], out[u]]
    // Build the Fail pointers by BFS, then compute the DFS order of the Fail tree.
    void build()
    {
        queue<int> q;
        f[rt] = rt;
        for (int c = 0; c < 26; c++)
        {
            if (~ch[rt][c])
                f[ch[rt][c]] = rt, q.push(ch[rt][c]);
            else
                ch[rt][c] = rt;
        }
        while (!q.empty())
        {
            int u = q.front();
            q.pop();
            for (int c = 0; c < 26; c++)
            {
                if (~ch[u][c])
                    f[ch[u][c]] = ch[f[u]][c], q.push(ch[u][c]);
                else
                    ch[u][c] = ch[f[u]][c];
            }
        }
        for (int i = 1; i < sz; i++) G[f[i]].push_back(i);
        dfn = 0;
        dfs(rt);
        for (int i = 0; i < sz; i++) G[i].clear();
    }
    void dfs(int u)
    {
        in[u] = ++dfn;
        for (auto& v : G[u]) dfs(v);
        out[u] = dfn;
    }
    // Segment tree over DFS positions: range "take max with v", point query.
    #define lson o << 1
    #define rson o << 1 | 1
    #define Lson l, m, o << 1
    #define Rson m + 1, r, o << 1 | 1
    int val[N << 2], setv[N << 2]; // val: maximum in range, setv: lazy tag
    inline void pushup(int o) { val[o] = max(val[lson], val[rson]); }
    inline void pushdown(int o)
    {
        if (!setv[o]) return;
        setv[lson] = max(setv[lson], setv[o]);
        setv[rson] = max(setv[rson], setv[o]);
        val[lson] = max(val[lson], setv[o]);
        val[rson] = max(val[rson], setv[o]);
        setv[o] = 0;
    }
    void build(int l, int r, int o)
    {
        setv[o] = 0;
        if (l == r)
        {
            val[o] = 0;
            return;
        }
        const int m = (l + r) >> 1;
        build(Lson), build(Rson);
        pushup(o);
    }
    void update(int L, int R, int v, int l, int r, int o)
    {
        if (L <= l && r <= R)
        {
            val[o] = max(val[o], v);
            setv[o] = max(setv[o], v);
            return;
        }
        pushdown(o);
        const int m = (l + r) >> 1;
        if (L <= m) update(L, R, v, Lson);
        if (m < R) update(L, R, v, Rson);
        pushup(o);
    }
    int query(int p, int l, int r, int o)
    {
        if (l == r) return val[o];
        pushdown(o);
        const int m = (l + r) >> 1;
        if (p <= m) return query(p, Lson);
        return query(p, Rson);
    }
    // Process the words in input order; returns the best total importance.
    int solve(const vector<pair<string, int>>& v)
    {
        build(1, sz, 1);
        int ans = 0;
        for (auto& it : v)
        {
            const string& s = it.first;
            const int& w = it.second;
            int tmp = 0, u = rt;
            for (auto& t : s)
            {
                int c = idx(t);
                u = ch[u][c];
                // best dp[j] among earlier words that are a suffix of this prefix
                tmp = max(tmp, query(in[u], 1, sz, 1));
            }
            tmp += w;
            ans = max(ans, tmp);
            // publish dp[i] to the whole Fail-tree subtree of the word's end node
            update(in[u], out[u], tmp, 1, sz, 1);
        }
        return ans;
    }
} ac;

char buf[N];

int main()
{
    int T, kase = 0;
    scanf("%d", &T);
    while (T--)
    {
        int n;
        scanf("%d", &n);
        vector<pair<string, int>> v;
        ac.init();
        for (int i = 0, w; i < n; i++)
        {
            scanf("%s%d", buf, &w);
            v.emplace_back(buf, w);
            ac.insert(buf);
        }
        ac.build();
        int ans = ac.solve(v);
        printf("Case #%d: %d\n", ++kase, ans);
    }
}
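One thing to keep in mind about the recursive `dfs` above: the Fail tree can be as deep as the longest word (for a word such as aaaa…a, every node's Fail pointer goes to the node one level up), so the recursion may get deep and could overflow the stack in some environments. A non-recursive DFS order can be sketched as follows. This is an illustrative variant under stated assumptions, not the author's code: the name `eulerTour` and the `children` adjacency-list parameter are hypothetical, and the numbering is 1-based to match `in`/`out` in the solution.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not from the original post): iterative DFS order.
// children[u] lists the children of u in the rooted tree.
// Afterwards the subtree of u occupies exactly positions in[u]..out[u].
void eulerTour(const std::vector<std::vector<int>>& children, int root,
               std::vector<int>& in, std::vector<int>& out)
{
    const std::size_t n = children.size();
    in.assign(n, 0);
    out.assign(n, 0);
    int timer = 0;
    // Each stack entry is (node, index of the next child to visit).
    std::vector<std::pair<int, std::size_t>> stack;
    in[root] = ++timer;
    stack.emplace_back(root, 0);
    while (!stack.empty())
    {
        const int u = stack.back().first;
        const std::size_t next = stack.back().second;
        if (next < children[u].size())
        {
            stack.back().second = next + 1; // advance before pushing the child
            const int v = children[u][next];
            in[v] = ++timer;
            stack.emplace_back(v, 0);
        }
        else
        {
            out[u] = timer; // all descendants have been numbered
            stack.pop_back();
        }
    }
}
```

With the tree 0 → {1, 2}, 1 → {3}, the nodes are numbered 0, 1, 3, 2 in that order, so the subtree of node 1 is the interval [2, 3].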