2018 ACM ICPC: slightly harder than last year's contest, with Moscow State University taking the world championship
2018 ACM ICPC World Finals recap:
The problem set this year was a bit harder than in the last few years. This may be a case of the judges having found their way back to a normal difficulty, after having somewhat overdone the easy end of the problem set spectrum for a few years following the way too hard problem set we gave in 2014. Congratulations to Moscow State University, the 2018 ICPC World Champions! In terms of number of teams that ended up solving each problem, the numbers were:
Problem | A | B | C | D | E | F | G | H | I | J | K |
Solved | 106 | 135 | 0 | 5 | 10 | 126 | 7 | 46 | 47 | 0 | 124 |
Submissions | 262 | 247 | 4 | 15 | 91 | 462 | 75 | 278 | 229 | 1 | 254 |
The most popular language was (as usual) C++ by a wide margin: 1813 submissions, trailed by Java at 101, and only 4 submissions for the other languages (C/Kotlin/Python 2/Python 3).
A note about solution sizes: below the size of the smallest judge and team solutions for each problem is stated. It should be mentioned that these numbers are just there to give an indication of the order of magnitude. The judge solutions were not written to be minimal (though some of us may have a tendency to overcompactify our code) and it is trivial to make them shorter by removing spacing, renaming variables, and so on. And of course the same goes for the code written by the teams during the contest!
Problem A: Catch the Plane
Shortest judge solution: 1480 bytes. Shortest team solution (during contest): 811 bytes.
The judges (and Per and Onufry in particular) were divided on how difficult this problem is. The contestants decided that it is one of the easier problems in the set.
To solve it, we are going to go backwards in time, starting from when we have to be at the airport. We will store and update an array that for each station S holds the probability of getting to the airport in time, if we start on S at the time we are currently considering. We initialize this array to 0 in all stations but the airport, and to 1 at the airport, which represents the probabilities at time k. Let’s figure out how to update this.
We will sweep by decreasing time, storing, as events, the arrivals and departures of buses. When a bus from station A arrives at station B, the probabilities do not change (the fact of a bus arriving does not change the probabilities). However, we have to remember the current probability of reaching the airport from B – we will use that probability to update the probability at A when the bus departs. When a bus departs from A, we have to update the probability of getting to the airport from A. If the current probability at A is p, and the probability at the arrival time at B was q, then if p ≥ q, there’s no point in getting on the bus; while if p < q, we can update the probability at A to rq + (1 − r)p, where r is the probability that the bus from A to B actually departs.
The last thing to deal with is that multiple buses may depart from one station at the same time. In this case, we cannot update the probability immediately after processing each bus; we have to process all of those buses, see which one gives the best update, and choose that one.
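The backwards sweep above can be sketched as follows. The input format (a list of `(a, b, s, t, r)` tuples) and the tie-breaking at equal times (a passenger arriving at a station cannot catch a bus departing at the same instant) are assumptions for illustration; adjust both to the actual statement.

```python
from collections import defaultdict

def catch_the_plane(n, airport, deadline, buses):
    """Sketch of the backwards sweep. Assumed input: buses is a list of
    (a, b, s, t, r), a bus from station a to b departing at time s,
    arriving at time t, and actually running with probability r.
    Returns best[S] = probability of reaching the airport by the deadline
    when standing at station S before the earliest event."""
    best = [0.0] * n
    best[airport] = 1.0
    arrive = [0.0] * len(buses)  # probability at b, recorded at arrival time
    events = []                  # (time, kind, bus index); kind 1 = arrival
    for i, (a, b, s, t, r) in enumerate(buses):
        if t <= deadline:        # a bus arriving after the deadline is useless
            events.append((t, 1, i))
            events.append((s, 0, i))
    # sweep backwards in time; at equal times, record arrivals first
    events.sort(key=lambda e: (-e[0], -e[1]))
    j = 0
    while j < len(events):
        time = events[j][0]
        # record the current probability at b for every bus arriving now
        while j < len(events) and events[j][0] == time and events[j][1] == 1:
            i = events[j][2]
            arrive[i] = best[buses[i][1]]
            j += 1
        # gather all simultaneous departures, then apply the best update
        cand = defaultdict(float)
        while j < len(events) and events[j][0] == time:
            i = events[j][2]
            a, _, _, _, r = buses[i]
            p, q = best[a], arrive[i]
            if q > p:  # only board if the bus improves our chances
                cand[a] = max(cand[a], r * q + (1 - r) * p)
            j += 1
        for a, v in cand.items():
            best[a] = max(best[a], v)
    return best
```

Note how a cancelled bus leaves us at the station with the probability p that already accounts for all later departures, since those were processed earlier in the backwards sweep.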
Problem B: Comma Sprinkler
Shortest judge solution: 704 bytes. Shortest team solution (during contest): 926 bytes.
This was the easiest problem in this problem set, but it still was not entirely trivial. The simple approach of just applying Dr. Sprinkler’s rules until no changes are made is too slow.
The observation to be made here is that the problem can be turned into a graph problem. Let’s create two vertices for every distinct word in the input, one representing the beginning of that word and one representing the end. We connect the beginning of word b to the end of word a if a is immediately followed by b anywhere in the text: if we get a comma before b anywhere in the text, we will also put one between a and b, which means we will put a comma after a everywhere in the text (and vice versa).
This means that if in the original input there is a comma after some word a, and there is a path from (a, end) to (c, end) in the graph for some c, then we will also put a comma after every occurrence of c in the text after enough applications of the rules. Thus, the problem is about finding connected components in this graph. This is doable with an application of any graph traversal algorithm.
Thus, we begin by going over the text and constructing the graph. We can store the words in a dictionary structure (like a C++ map), and for every word in the text, insert or update it in the dictionary: if there’s a comma after it, mark that in the dictionary, and if it is not followed by a period, mark the next word as adjacent in the dictionary. In a single pass we get enough information to construct the graph. Now, we put into the “to visit” queue the (a, end) vertices for all words a that already have a comma after them somewhere, run, say, breadth-first search, and finally reconstruct the text, putting a comma after every occurrence of a word a that is not followed by a period and for which (a, end) has been visited by the search.
This, will run, in time, O(n log(n)), where n, is the size, of the input, which, is, fast, enough.
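A compact sketch of this idea, using union-find in place of breadth-first search (equivalent for finding the components). The simplified input format – one string whose words may be suffixed by ',' or '.' – is an assumption for illustration:

```python
def sprinkle(text):
    """Comma Sprinkler sketch via union-find over (begin, word) / (end, word)
    nodes. Assumes a simplified input: a single string of words separated by
    spaces, each optionally suffixed by ',' or '.'."""
    tokens = text.split()
    words = [(t.rstrip('.,'), t[-1] if t[-1] in '.,' else '') for t in tokens]

    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    def union(x, y):
        parent[find(x)] = find(y)

    # end-of-a and begin-of-b join whenever a is followed by b in a sentence
    for (a, pa), (b, _) in zip(words, words[1:]):
        if pa != '.':
            union(('end', a), ('begin', b))
    # components that already contain a comma
    marked = {find(('end', w)) for w, p in words if p == ','}

    out = []
    for w, p in words:
        if p == '.':
            out.append(w + '.')            # sentence-final: never a comma
        elif find(('end', w)) in marked:
            out.append(w + ',')
        else:
            out.append(w)
    return ' '.join(out)
```

A comma before a word is never stored separately here: it always attaches to the end of the preceding word, which is exactly why only the (a, end) side of each component needs to be checked on output.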
Problem C: Conquer the World
Shortest judge solution: 2510 bytes. Shortest team solution (during contest): N/A (no team solved it).
This was one of the two very hard problems in the set. What is not hard to see is that the problem is simply asking for a minimum cost maximum flow between a set of sources and a set of sinks in a flow graph. The flow graph in question has a special structure: most importantly, it is a tree (and in addition, all edge capacities are infinite – we can move as many armies as we like across an edge – only the source and sink vertices have limited capacities). However the tree can have around 250 000 vertices/edges, so running a general min cost max flow algorithm on this would time out rather badly. So to solve the problem we have to (unsurprisingly) exploit the structure of the graph.
Let us without loss of generality make the following two assumptions:
(Applying the transformations may blow up the size of the tree by a factor 4, so from an implementation point of view, one might want to unravel what the transformations mean in the context of the solution below and not actually do the transformations on the tree – the solution works without these assumptions, and we have made them only to simplify the presentation.)
Root the tree arbitrarily. A natural idea is to try to solve this by some form of dynamic programming on the tree, moving upwards from the leaves until we reach the root where we get the answer. But even so, it is not at all clear what state to keep in the dynamic program, or how to go up the tree efficiently.
Let us first describe the state to use, and how to use it, without worrying too much about efficiency; later we will figure out how to actually make it efficient. For a rooted tree T (which you can think of, for now, as the subtree of the input below some vertex v), define a function cT : Z → Z, where cT(a) is the answer to the question “What is the minimum cost of movements within the tree T, assuming that exactly a new armies have been added to the root of T?” If we have computed this function for the entire input tree T, the answer to the problem is then simply cT(0). Note that cT is defined also for negative a, which is interpreted as an additional demand of |a| armies having been introduced at the root. Let ΔT denote the net demand within T (sum of yi values minus sum of xi values). Then cT(a) is a well-defined non-increasing function for a ≥ ΔT.
We now use this setup to make a solution that is too slow, running in time Θ(nX) where X is the sum of all xi values in the graph. The idea is as mentioned above to build the functions cT for subtrees of the input starting at the leaves and moving upwards. For T being a single leaf node of the input, the function cT is trivial: cT (a)= 0 for all a (there can never be any costs from movement within a tree of a single node). In order to progress up the tree, we need to be able to do the following two operations:
The first operation is extend: given cT for a tree T rooted at a vertex w, and an edge (v, w), compute cT′ for the tree T′ consisting of T together with the edge (v, w), rooted at v. The identities for this are

ΔT′ = ΔT and cT′(a) = min(cT′(a − 1), cT(a) + |a| cost(v, w)).

The min here is because when a new armies are added to the root we have two options: either we ignore at least one army (and get cost cT′(a − 1)), or we transport all of them to w (and get cost cT(a) + |a| cost(v, w)). For negative a one can see that the min is always achieved by the second option (which is good, because in this setting we are moving some armies out of the tree, and do not have the option of ignoring any of them).
The second operation is join: given cTL and cTR for two trees TL and TR sharing the same root, compute cT for their union T. The identities for this are

ΔT = ΔTL + ΔTR and cT(a) = min{cTL(a′) + cTR(a − a′) : ΔTL ≤ a′ ≤ a − ΔTR}.
To get the slow Θ(nX) solution, we can simply represent the functions cT as arrays of function values and implement the equations written above.
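To pin down the semantics of the two identities, here is a deliberately naive Python transcription (its join is even slower than the Θ(nX) array version, but it makes the recurrences concrete). The input format – a tree as a list of weighted edges, with x[i] armies and y[i] demand at vertex i – is an assumption for illustration:

```python
from math import inf
import sys

def conquer_slow(n, edges, x, y):
    """Naive transcription of the extend/join identities. edges is a list of
    (u, v, w) forming a tree on vertices 0..n-1; vertex i holds x[i] armies
    and y[i] demand; moving one army across an edge costs its weight.
    Each function c is kept as a dict a -> cost for delta <= a <= X."""
    sys.setrecursionlimit(10000)
    X = sum(x)
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    def solve(v, par):
        delta = y[v] - x[v]
        c = {a: 0 for a in range(delta, X + 1)}  # single-vertex tree: no cost
        for u, w in adj[v]:
            if u == par:
                continue
            cu, du = solve(u, v)
            # extend: c'(a) = min(c'(a-1), c_child(a) + |a| * w)
            ce, prev = {}, inf
            for a in range(du, X + 1):
                prev = min(prev, cu[a] + abs(a) * w)
                ce[a] = prev
            # join: c(a) = min over a' of c(a') + ce(a - a')
            nd = delta + du
            nc = {}
            for a in range(nd, X + 1):
                best = inf
                for ap in range(delta, min(a - du, X) + 1):
                    if a - ap <= X:
                        best = min(best, c[ap] + ce[a - ap])
                nc[a] = best
            c, delta = nc, nd
        return c, delta

    c, _ = solve(0, -1)
    return c[0]  # no armies added at the root of the whole tree
```

For example, one army at vertex 0 and one demand at vertex 1 across an edge of cost 2 gives a total cost of 2.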
In order to speed this up, perhaps the key observation is that the functions cT are not only non-increasing, they are also convex (proof omitted; it can be derived from the identities above, or more abstractly from properties of min cost max flows). I.e., there is a diminishing-returns type of property – the savings in cost attained by moving in more armies decreases as we move in more armies. For this reason, instead of working with cT directly, it turns out to be more convenient to work with the function fT : Z≥0 → Z defined by fT(a) = cT(ΔT + a) − cT(ΔT + a + 1) (i.e., fT(0) says how much the cost decreases when going from ΔT to ΔT + 1 armies moving out of T, and so on). So we will keep this function, and the value bT := cT(ΔT), the base cost for the tree T if we only keep the bare minimum of soldiers inside it and move everyone else out of it. The fact that cT is non-increasing and convex then translates into the function fT being non-negative and non-increasing.
How does this change of perspective affect the identities for cT above? The join operation becomes particularly nice in this new perspective: it follows from the convexity property that if we take all the fTL and fTR values and merge them into a big list sorted in non-increasing order, then the resulting list is exactly the list of values of fT (abstractly, this is because the operation we are doing is taking the Minkowski sum of the two functions cTL and cTR, and the Minkowski sum of convex sets behaves nicely). This suggests the following way of representing the functions fT: keep the values as a sorted set, and when joining fTL with fTR, add the values from the smaller of the two sets to the larger of the two (and be careful not to do anything which takes linear time in the larger of the two sets). By a standard argument, this results in a total running time of O(n + X log² X) for doing all the joins.
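The join is then nothing more than a multiset union; a minimal sketch with Python heaps (values stored negated so that heapq's min-heap acts as a max-heap):

```python
import heapq

def join_f(fL, fR):
    """Small-to-large join sketch: fL and fR are heaps holding the (negated)
    f-values of two subtrees. The joined f is the multiset union of the two
    value lists, so we push the smaller heap's elements into the larger one."""
    if len(fL) < len(fR):
        fL, fR = fR, fL          # always keep the big heap, push the small one
    for v in fR:
        heapq.heappush(fL, v)
    return fL
```

The base costs need no merging at all: they simply add, bT = bTL + bTR.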
For the extend operation, the identity for cT above now becomes
fT′(a) = max(0, fT(a) − sgn(ΔT + a) cost(v, w))
(where we define sgn(a) = +1 if a ≥ 0 and −1 if a < 0). Computing this by updating each individual fT-value would be too costly. Ignoring for a moment the cap at 0 and focusing only on the second argument, we see that the transformation performed on fT is fairly simple: it gets increased by cost(v, w) for all a < −ΔT (which are none, if ΔT ≥ 0, i.e., if the subtree does not have more surplus than demand), and decreased by the same amount for all a ≥ −ΔT. We can arrange for this by augmenting the set structure suggested above with a split point and two global shifts – everything below the split point has been shifted by the first shift, and everything above the split point has been shifted by the second shift. When we do the extend operation, there may already be an existing split point and shifts; we then have to move the split point, which can be done in time O(d log d), where d is the distance between the two split points. This value might be as large as X, but in total over all extend operations done in the tree, the sum of these distances can never be more than 2X, so the total time we get from these moves of the split points is O(n + X log X). Coming back to the ignored max(0, ...) part from above, the easiest way to deal with that is to simply remove any negative values from the end of the set after having performed the operation above.
We also need to revisit the join operation and check that it can still be done with the augmented data structure with shifts. Implementation-wise, instead of keeping a single ordered set with a split point, it is probably easier (at least all our implementations seemed to feel so) to keep two separate sets corresponding to the values below and the values above the split point. Another implementation detail is that using full ordered sets is not necessary; one can get away with two simple max-heaps.