问题Problem

codility看到的题目, 题目内容如下

题目内容

英文题目

A non-empty array A consisting of N integers is given. A pair of integers $(P, Q)$, such that $0 ≤ P < Q < N$, is called a slice of array A (notice that the slice contains at least two elements). The average of a slice $(P, Q)$ is the sum of $A[P] + A[P + 1] + … + A[Q]$ divided by the length of the slice. To be precise, the average equals $(A[P] + A[P + 1] + … + A[Q]) / (Q − P + 1)$.

中文翻译

给定一个由 N 个整数组成的非空数组 A. 一个整数对 $(P, Q)$, 使得 $0 ≤ P < Q < N$, 被称为数组 A 的一个切片(注意切片至少包含两个元素). 切片 $(P, Q)$ 的_平均值_是 $A[P] + A[P + 1] + … + A[Q]$ 的总和除以切片的长度. 确切地说, 平均值等于 $(A[P] + A[P + 1] + … + A[Q]) / (Q − P + 1)$.

例子

英文例子

For example, array A such that:

A=[4,2,2,5,1,5,8]

contains the following example slices:

  • slice (1, 2), whose average is (2 + 2) / 2 = 2;
  • slice (3, 4), whose average is (5 + 1) / 2 = 3;
  • slice (1, 4), whose average is (2 + 2 + 5 + 1) / 4 = 2.5.

The goal is to find the starting position of a slice whose average is minimal.

the example array A should return 1, the minimum average slice is A[1:2]

  • N is an integer within the range [2..100,000];
  • each element of array A is an integer within the range [−10,000..10,000].

中文翻译

例如, 数组 A 如下:

A=[4,2,2,5,1,5,8]

包含以下示例切片:

  • 切片 (1, 2), 其平均值为 (2 + 2) / 2 = 2;
  • 切片 (3, 4), 其平均值为 (5 + 1) / 2 = 3;
  • 切片 (1, 4), 其平均值为 (2 + 2 + 5 + 1) / 4 = 2.5.

目标是找到一个切片的起始位置, 该切片的平均值最小.

目标数组A应该返回1, 平均值最小的切片为A[1:2]

  • N是在[2,100,000]之间的整数
  • A中的每一个元素范围都在[-10,000,10,000]

解决方案Solution

这是我写在stackOverFlow上面的一片回答

英文版本

At first, we should know a mathematical pattern: when the length of a list is greater than 3, it can be divided into several sub-slices of length 2 and 3. For example, if the length of a list is 5, it can be made up of 2 and 3; if the length is 4, it can be made up of 2 and 2.

In this way, if we have a list with length N: [a, b, c, d, ...], and N > 3, we set the sum of the elements in the list to be Sum, and the average to be Avg = Sum/N.

We want to prove that in this list, there is at least one of its sub-slices with a length of 2 or 3 that has an average less than or equal to Avg.

We use proof by contradiction, assuming that all of the sub-slices’ averages are greater than Avg.

By dividing the list into several contiguous sub-slices, each of length 2 or 3, where the last sub-slice could have a length of 2 or 3: $$ [a1, a2], [a3, a4], ….., [a_{N-2}, a_{N-1}, a_{N}] $$

or

$$ [a1, a2], [a3, a4], ….., [a_{N-1}, a_{N}] $$

According to our assumption, every sub-slice’s average is greater than Avg;

$$ (a1 + a2)/2 > Avg; \ (a3 + a4)/2 > Avg; \ ….. $$ Summing all the inequalities, we get: $$ \frac{a1 + a2}{2} + \frac{a3 + a4}{2} + … + \frac{a_{N-1} + a_N}{2} > Avg * \frac{N}{2} $$ The left side of the inequality equals $Sum/2$ or is greater than $Sum/2$ (if the last sub-slice has a length of 3); $$ Sum/2 > Avg * (N/2)\ Sum > Avg * N\ Sum > (Sum/N) * N\ Sum > Sum, $$ $Sum>Sum$, which is a contradiction.

We get the contradiction, so our assumption does not hold.

The conclusion is that if there is a list with a length greater than 3, it must have a sub-slice with a length of 2 or 3 that has an average less than or equal to the whole list’s average.

In this way, we just need to find the minimum average of all the sub-slices with lengths of 2 or 3;

here is a Python version example

def solution(A):
    n = len(A)
    if n == 2:
        return 0
    min_avg = float('inf')
    min_start = 0
    for i in range(n - 1):
        avg = (A[i] + A[i + 1]) / 2
        if avg < min_avg:
            min_avg = avg
            min_start = i
    for i in range(n - 2):
        avg = (A[i] + A[i + 1] + A[i + 2]) / 3
        if avg < min_avg:
            min_avg = avg
            min_start = i
    return min_start

中文版本

首先, 我们应该了解一个数学定理: 当列表的长度大于3时, 可以将其分成几个长度为2和3的子切片. 例如, 如果列表长度为5, 则可以由2和3组成; 如果长度为4, 则可以由2和2组成.

这样, 如果我们有一个长度为N的列表: [a, b, c, d, ...], 并且N > 3, 我们将列表中元素的总和设为Sum, 平均值设为Avg = Sum/N.

我们想要证明, 在这个列表中, 至少有一个长度为2或3的子切片的平均值小于或等于Avg.

我们采用反证法, 假设所有子切片的平均值都大于Avg.

通过将列表划分为若干个连续的子切片, 每个长度为2或3, 其中最后一个子切片的长度可能为2或3: $$ [a1, a2], [a3, a4], ….., [a_{N-2}, a_{N-1}, a_{N}] $$

或者

$$ [a1, a2], [a3, a4], ….., [a_{N-1}, a_{N}] $$

根据我们的假设, 每个子切片的平均值都大于Avg;

$$ \frac{(a1 + a2)}{2} > Avg; \ \frac{(a3 + a4)}{2} > Avg; \ ….. $$ 将所有不等式相加, 我们得到: $$ \frac{a1 + a2}{2} + \frac{a3 + a4}{2} + … + \frac{a_{N-1} + a_N}{2} > Avg * \frac{N}{2} $$ 不等式的左侧等于$Sum/2$或大于$Sum/2$(如果最后一个子切片的长度为3); $$ Sum/2 > Avg * (N/2)\ Sum > Avg * N\ Sum > (Sum/N) * N\ Sum > Sum, $$ $Sum>Sum$, 这显然是矛盾的.

我们得到了矛盾, 因此我们的假设不成立.

结论是, 如果有一个长度大于3的列表, 它必然有一个长度为2或3的子切片, 其平均值小于或等于整个列表的平均值.

这样, 我们只需要找到所有长度为2或3的子切片的最小平均值;

这里是一个Python版本的示例:

def solution(A):
    n = len(A)
    if n == 2:
        return 0
    min_avg = float('inf')
    min_start = 0
    for i in range(n - 1):
        avg = (A[i] + A[i + 1]) / 2
        if avg < min_avg:
            min_avg = avg
            min_start = i
    for i in range(n - 2):
        avg = (A[i] + A[i + 1] + A[i + 2]) / 3
        if avg < min_avg:
            min_avg = avg
            min_start = i
    return min_start