Category Archives: Mathematics

用 Python 来研究数学 — SymPy 符号工具包介绍

SymPy 的简单介绍

SymPy 是一个符号计算的 Python 库,完全由 Python 写成,为许多数值分析,符号计算提供了重要的工具。SymPy 的第一个版本于 2007 年开源,并且经历了十几个版本的迭代,在 2019 年已经基于修正的 BSD 许可证开源了 1.4 版本。SymPy 的开源地址和官方网站分别是:

  1. GitHub 链接:https://github.com/sympy/sympy
  2. SymPy 官方网站:https://www.sympy.org/en/index.html
sympy_logo
SymPy 的 logo

SymPy 的 1.4 版本文档中,可以看出,SymPy 可以支持很多初等数学,高等数学,甚至研究生数学的符号计算。在初等数学和高等数学中,SymPy 可以支持的内容包括但不限于:

  1. 基础计算(Basic Operations);
  2. 公式简化(Simplification);
  3. 微积分(Calculus);
  4. 解方程(Solver);
  5. 矩阵(Matrices);
  6. 几何(geometry);
  7. 级数(Series);

在更多的数学领域中,SymPy 可以支持的内容包括但不限于:

  1. 范畴论(Category Theory);
  2. 微分几何(Differential Geometry);
  3. 常微分方程(ODE);
  4. 偏微分方程(PDE);
  5. 傅立叶变换(Fourier Transform);
  6. 集合论(Set Theory);
  7. 逻辑计算(Logic Theory)。
sympy_tutorial
SymPy 的教学目录

SymPy 的工具库介绍

SymPy 的基础计算

在数学中,基础的计算包括实数和复数的加减乘除,那么就需要在程序中描述出实数与复数。著名的欧拉公式

e^{i\pi}+1 = 0

正好用到了数学中最常见的五个实数。在 SymPy 里面,e, i, \pi, \infty 是用以下符号来表示的:其中 sympy.exp() 表示以 e 为底的函数。

sympy.exp(1), sympy.I, sympy.pi, sympy.oo

而想要计算欧拉公式的话,只需要输入下面的公式即可:

>>> sympy.exp(sympy.I * sympy.pi) + 1
0

如果需要看 e, \pi 的小数值,可以使用 evalf() 函数,其中 evalf() 函数里面的值表示有效数字的位数。例如下面就是精确到 10 位有效数字。当然,也可以不输入。

>>> sympy.E.evalf(10)
2.718281828
>>> sympy.E.evalf()
2.71828182845905
>>> sympy.pi.evalf(10)
3.141592654
>>> sympy.pi.evalf()
3.14159265358979

除此之外,如果需要查看某个实数的有效数字,也是类似操作的:

>>> expr = sympy.sqrt(8)
>>> expr.evalf()
2.82842712474619

而对于实数的加减乘除,则可以如下操作:

>>> x, y= sympy.symbols("x y")
>>> x + y
x + y
>>> x - y
x - y
>>> x * y
x*y
>>> x / y
x/y

而对于复数的加减乘除,则是类似的操作,令两个复数分别是 z_{1} = x_{1} + i y_{1}z_{2} = x_{2} + i y_{2}

>>> x1, y1, x2, y2 = sympy.symbols("x1 y1 x2 y2")
>>> z1 = x1 + y1 * sympy.I
x1 + I*y1
>>>  z2 = x2 + y2 * sympy.I
x2 + I*y2
>>> z1 + z2
x1 + x2 + I*y1 + I*y2
>>> z1 - z2
x1 - x2 + I*y1 - I*y2
>>> z1 * z2
(x1 + I*y1)*(x2 + I*y2)
>>> z1 / z2
(x1 + I*y1)/(x2 + I*y2)

对于多项式而言,有的时候我们希望将其展开,有的时候则需要将其合并,最终将其简化成最简单的形式。

>>> sympy.expand((x+1)**2)
x**2 + 2*x + 1
>>> sympy.expand((x+1)**5)
x**5 + 5*x**4 + 10*x**3 + 10*x**2 + 5*x + 1
>>> sympy.factor(x**3+1)
(x + 1)*(x**2 - x + 1)
>>> sympy.factor(x**2+3*x+2)
(x + 1)*(x + 2)
>>> sympy.simplify(x**2 + x + 1 - x)
x**2 + 1
>>> sympy.simplify(sympy.sin(x)**2 + sympy.cos(x)**2)
1

在多变量的场景下,SymPy 也可以对其中的某个变量合并同类项,同时还可以计算某个变量的某个次数所对应的系数是多少,例如:

>>> expr = x*y + x - 3 + 2*x**2 - x**2 + x**3 + y**2 + x**2*y**2
>>> sympy.collect(expr,x)
x**3 + x**2*(y**2 + 1) + x*(y + 1) + y**2 - 3
>>> sympy.collect(expr,y)
x**3 + x**2 + x*y + x + y**2*(x**2 + 1) - 3
>>> expr.coeff(x, 2)
y**2 + 1
>>> expr.coeff(y, 1)
x

有理函数形如 f(x) = p(x)/q(x),其中 p(x)q(x) 都是多项式。一般情况下,我们希望对有理函数进行简化,合并或者分解的数学计算。

在需要合并的情形下,如果想把有理函数处理成标准格式 p(x)/q(x) 并且去除公因子,那么可以使用 cancel 函数。另一个类似的就是 together 函数,但是不同之处在于 cancel 会消除公因子,together 不会消除公因子。例如:

expr = \frac{x^{2}+3x+2}{x^{2}+x}

>>> expr = (x**2 + 3*x + 2)/(x**2 + x)
>>> sympy.cancel(expr)
(x + 2)/x
>>> sympy.together(expr)
(x**2 + 3*x + 2)/(x*(x + 1))

除了合并和消除公因子之外,有的时候还希望对分子和分母进行因式分解,例如:

expr = (x**2 + 3*x + 2)/(x**2 + x)
>>> sympy.factor(expr)
(x + 2)/x
>>> expr = (x**3 + 3*x**2 + 2*x)/(x**5+x)
>>> sympy.factor(expr)
(x + 1)*(x + 2)/(x**4 + 1)
>>> expr = x**2 + (2*x+1)/(x**3+1)
>>> sympy.factor(expr)
(x**5 + x**2 + 2*x + 1)/((x + 1)*(x**2 - x + 1))

合并的反面就是部分分式展开(Partial Fraction Decomposition),它是把有理函数分解成多个次数较低的有理函数和的形式。这里需要用 apart 函数:

>>> expr = (x**4 + 3*x**2 + 2*x)/(x**2+x)
>>> sympy.apart(expr)
x**2 - x + 4 - 2/(x + 1)
>>> expr = (x**5 + 1)/(x**3+1)
>>> sympy.apart(expr)
x**2 - (x - 1)/(x**2 - x + 1)

在 SymPy 里面,同样支持各种各样的三角函数,例如:三角函数的简化函数 trigsimp,三角函数的展开 expand_trig,

>>> expr = sympy.sin(x)**2 + sympy.cos(x)**2
>>> sympy.trigsimp(expr)
1
>>> sympy.expand_trig(sympy.sin(x+y))
sin(x)*cos(y) + sin(y)*cos(x)
>>> sympy.expand_trig(sympy.cos(x+y))
-sin(x)*sin(y) + cos(x)*cos(y)
>>> sympy.trigsimp(sympy.sin(x)*sympy.cos(y) + sympy.sin(y)*sympy.cos(x))
sin(x + y)
>>> sympy.trigsimp(-sympy.sin(x)*sympy.sin(y) + sympy.cos(x)*sympy.cos(y))
cos(x + y)

同样的,在乘幂上面,同样有简化函数 powsimp,效果与之前提到的 simplify 一样。除此之外,还可以根据底数来做合并,即分别使用 expand_power_exp 函数与 expand_power_base 函数。

>>> sympy.powsimp(x**z*y**z*x**z)
x**(2*z)*y**z
>>> sympy.simplify(x**z*y**z*x**z)
x**(2*z)*y**z
>>> sympy.expand_power_exp(x**(y + z))
x**y*x**z
>>> sympy.expand_power_base(x**(y + z))
x**(y + z)

作为指数的反函数对数,sympy 也是有着类似的展开合并函数,expand_log,logcombine 承担了这样的角色。

\ln(xy) = \ln(x) + \ln(y)

\ln(x/y) = \ln(x) - \ln(y)

>>> sympy.expand_log(sympy.log(x*y), force=True)
log(x) + log(y)
>>> sympy.expand_log(sympy.log(x/y), force=True)
log(x) - log(y)

 

SymPy 的微积分工具

下面,我们会从一个最基本的函数 f(x) = 1/x 出发,来介绍 SymPy 的各种函数使用方法。如果想进行变量替换,例如把 x 变成 y,那么可以使用 substitution 方法。除此之外,有的时候也希望能够得到函数 f 在某个点的取值,例如 f(1) = 1/1 = 1,那么可以把参数换成 1 即可得到函数的取值。例如,

>>> import sympy
>>> x = sympy.Symbol("x")
>>> f = 1 / x
1/x
>>> y = sympy.Symbol("y")
>>> f = f.subs(x,y)
1/y
>>> f = f.subs(y,1)
1

在微积分里面,最常见的概念就是极限,SymPy 里面的极限函数就是 limit。使用方法如下:

>>> f = 1/x
>>> sympy.limit(f,x,0)
oo
>>> sympy.limit(f,x,2)
1/2
>>> sympy.limit(f,x,sympy.oo)
0
>>> g = x * sympy.log(x)
>>> sympy.limit(g,x,0)
0

对于函数 f(x) = 1/x 而言,它的导数计算函数是 diff,n 阶导数也可以用这个函数算。

>>> f = 1/x
>>> sympy.diff(f,x)
-1/x**2
>>> sympy.diff(f,x,2)
2/x**3
>>> sympy.diff(f,x,3)
-6/x**4
>>> sympy.diff(f,x,4)
24/x**5

提到 n 阶导数,就必须要提一下 Taylor Series 了。对于常见函数的 Taylor Series,SymPy 也是有非常简便的方法,那就是 series 函数。其参数包括 expr, x, x0, n, dir,分别对应着表达式,函数的自变量,Taylor Series 的中心点,n 表示阶数,dir 表示方向,包括”+-“,”-“,”+”,分别表示 x\rightarrow x0, x\rightarrow x0^{-}, x\rightarrow x0^{+}

sympy.series.series.series(exprx=Nonex0=0n=6dir='+') >>> g = sympy.cos(x) >>> sympy.series(g, x) 1 - x**2/2 + x**4/24 + O(x**6) >>> sympy.series(g, x, x0=1, n=10) cos(1) - (x - 1)*sin(1) - (x - 1)**2*cos(1)/2 + (x - 1)**3*sin(1)/6 + (x - 1)**4*cos(1)/24 - (x - 1)**5*sin(1)/120 - (x - 1)**6*cos(1)/720 + (x - 1)**7*sin(1)/5040 + (x - 1)**8*cos(1)/40320 - (x - 1)**9*sin(1)/362880 + O((x - 1)**10, (x, 1))

积分的计算函数是 integrate,包括定积分与不定积分:

\int\frac{1}{x}dx = \ln(x)+C

\int_{1}^{2}\frac{1}{x}dx = \ln(2)

>>> f = 1/x
>>> sympy.integrate(f,x)
log(x)
>>> sympy.integrate(f, (x,1,2))
log(2)

对于广义积分而言,就需要用到 \infty 这个概念了,但是在 SymPy 里面的写法还是一样的。

\int_{-\infty}^{0}e^{-x^{2}}dx=\frac{\sqrt{\pi}}{2}

\int_{0}^{+\infty}e^{-x}dx = 1

\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}e^{-x^{2}-y^{2}}dxdy = \pi

>>> g = sympy.exp(-x**2)
>>> sympy.integrate(g, (x,-sympy.oo,0))
sqrt(pi)/2
>>> g = sympy.exp(-x)
>>> sympy.integrate(g, (x, 0, sympy.oo))
1
>>> h = sympy.exp(-x**2 - y**2)
>>> sympy.integrate(h, (x,-sympy.oo, sympy.oo), (y, -sympy.oo, sympy.oo))
pi

 

SymPy 的方程工具

在初等数学中,通常都存在一元一次方程,一元二次方程等,并且在不同的域上有着不同的解。SymPy 里面的相应函数就是 solveset,根据定义域的不同,可以获得完全不同的解。

\{x\in\mathbb{R}: x^{3}-1=0\}

\{x\in\mathbb{C}:x^{3}-1=0\}

\{x\in\mathbb{R}:e^{x}-x=0\}

\{x\in\mathbb{R}:e^{x}-1=0\}

\{x\in\mathbb{C}:e^{x}-1=0\}

>>> sympy.solveset(sympy.Eq(x**3,1), x, domain=sympy.S.Reals)
{1}
>>> sympy.solveset(sympy.Eq(x**3,1), x, domain=sympy.S.Complexes)
{1, -1/2 - sqrt(3)*I/2, -1/2 + sqrt(3)*I/2}
>>> sympy.solveset(sympy.Eq(x**3 - 1,0), x, domain=sympy.S.Reals)
{1}
>>> sympy.solveset(sympy.Eq(x**3 - 1,0), x, domain=sympy.S.Complexes)
{1, -1/2 - sqrt(3)*I/2, -1/2 + sqrt(3)*I/2}
>>> sympy.solveset(sympy.exp(x),x)
EmptySet()
>>> sympy.solveset(sympy.exp(x)-1,x,domain=sympy.S.Reals)
{0}
>>> sympy.solveset(sympy.exp(x)-1,x,domain=sympy.S.Complexes)
ImageSet(Lambda(_n, 2*_n*I*pi), Integers)

在这里,Lambda 表示的是数学公式,第一个是自变量,第二个是函数,最后是自变量的定义域。

在线性代数中,最常见的还是多元一次方程组,那么解法是一样的:

\begin{cases}x+y-10=0 \\ x-y-2=0\end{cases}

>>> sympy.solve([x+y-10, x-y-2], [x,y])
{x: 6, y: 4}

对于三角函数,也是类似的写法:

\begin{cases} \sin(x-y)=0 \\ \cos(x+y)=0 \end{cases}

>>> sympy.solve([sympy.sin(x-y), sympy.cos(x+y)], [x,y])
[(-pi/4, 3*pi/4), (pi/4, pi/4), (3*pi/4, 3*pi/4), (5*pi/4, pi/4)]

 

SymPy 的矩阵工具

在矩阵论中,最常见的就是单位矩阵了,而单位矩阵只与一个参数有关,那就是矩阵的大小。下面就是 3*3,3*2,2*3 大小的矩阵。

>>> sympy.eye(3)
Matrix([
[1, 0, 0],
[0, 1, 0],
[0, 0, 1]])
>>> sympy.eye(3,2)
Matrix([
[1, 0],
[0, 1],
[0, 0]])
>>> sympy.eye(2,3)
Matrix([
[1, 0, 0],
[0, 1, 0]])

另外还有全零和全一矩阵,意思就是矩阵中的所有值全部是零和一。

>>> sympy.ones(2,3)
Matrix([
[1, 1, 1],
[1, 1, 1]])
>>> sympy.zeros(3,2)
Matrix([
[0, 0],
[0, 0],
[0, 0]])

而对角矩阵也可以使用 diag 轻松获得:

>>> sympy.diag(1,1,2)
Matrix([
[1, 0, 0],
[0, 1, 0],
[0, 0, 2]])

而矩阵的加法,减法,乘法,逆运算,转置,行列式,SymPy 都是可以支持的:

A = \left(\begin{array}{cc} 1 & 1 \\ 0 & 2 \end{array}\right)

B = \left(\begin{array}{cc} 1 & 0 \\ 1 & 1 \end{array}\right)

>>> A = sympy.Matrix([[1,1],[0,2]])
>>> B = sympy.Matrix([[1,0],[1,1]])
>>> A
Matrix([
[1, 1],
[0, 2]])
>>> B
Matrix([
[1, 0],
[1, 1]])
>>> A + B
Matrix([
[2, 1],
[1, 3]])
>>> A - B
Matrix([
[ 0, 1],
[-1, 1]])
>>> A * B
Matrix([
[2, 1],
[2, 2]])
>>> A * B**-1
Matrix([
[ 0, 1],
[-2, 2]])
>>> B**-1
Matrix([
[ 1, 0],
[-1, 1]])
>>> A.T
Matrix([
[1, 0],
[1, 2]])
>>> A.det()
2

在某些情况下,需要对矩阵进行加上一行或者加上一列的操作,在这里就可以使用 row_insert 或者 col_insert 函数:第一个参数表示插入的位置,第二个参数就是相应的行向量或者列向量。而在删除上就很简单了,直接使用 row_del 或者 col_del 即可。

>>> A
Matrix([
[1, 1],
[0, 2]])
>>> A.row_insert(1, sympy.Matrix([[10,10]]))
Matrix([
[ 1, 1],
[10, 10],
[ 0, 2]])
>>> A.col_insert(0, sympy.Matrix([3,3]))
Matrix([
[3, 1, 1],
[3, 0, 2]])
>>> A.row_del(0)
>>> A
Matrix([[0, 2]])
>>> A.col_del(1)
>>> A
Matrix([[0]])

在对角化方面,同样可以使用 eigenvals(),eigenvecs(), diagonalize() 函数:

>>> A
Matrix([
[1, 1],
[0, 2]])
>>> A.eigenvals()
{2: 1, 1: 1}
>>> A.eigenvects()
[(1, 1, [Matrix([
[1],
[0]])]), (2, 1, [Matrix([
[1],
[1]])])]
>>> A.diagonalize()
(Matrix([
[1, 1],
[0, 1]]), Matrix([
[1, 0],
[0, 2]]))

在 eigenvals() 返回的结果中,第一个表示特征值,第二个表示该特征值的重数。在特征向量 eigenvecs() 中,第一个表示特征值,第二个表示特征值的重数,第三个表示特征向量。在对角化 diagonalize() 中,第一个矩阵表示 P,第二个矩阵表示 DA = P*D*P^{-1}

在矩阵中,最常见的还是多元一次方程的解。如果要求 Ax =b 的解,可以有以下方案:

>>> A = sympy.Matrix([[1,1],[0,2]])
>>> A
Matrix([
[1, 1],
[0, 2]])
>>> b = sympy.Matrix([3,5])
>>> b
Matrix([
[3],
[5]])
>>> A**-1*b
Matrix([
[1/2],
[5/2]])
>>> sympy.linsolve((A,b))
{(1/2, 5/2)}
>>> sympy.linsolve((A,b),[x,y])
{(1/2, 5/2)}

 

SymPy 的集合论工具

集合论可以说是数学的基础,在任何数学的方向上都能够看到集合论的身影。在 SymPy 里面,有一个类叫做 sympy.sets.sets.set。在集合论里面,常见的就是边界,补集,包含,并集,交集等常见的操作。但是感觉 SymPy 中的集合论操作主要集中在实数域或者复数域上。

对于闭区间 I=[0,1] 和开区间 J = (0,1) 而言,在 SymPy 中使用以下方法来表示:

I = sympy.Interval(0,1)
J = sympy.Interval.open(0,1)
K = sympy.Interval(0.5,2)

其开始和结束的点可以分别使用 start 和 end 来表示:

>>> I.start
0
>>> I.end
1

其长度用 measure 来表示:

>>> I.measure
1

其闭包用 closure 来表示:

>>> I.closure
Interval(0, 1)

其内点用 interior 来表示:

>>> I.interior
Interval.open(0, 1)

判断其边界条件可以使用 left_open 或者 right_open 来做:

>>> I.left_open
False
>>> I.right_open
False

对于两个集合之间的操作,可以参考以下方法:

I = sympy.Interval(0,1)
K = sympy.Interval(0.5,2)
>>> I.intersect(K)
Interval(0.500000000000000, 1)
>>> I.union(K)
Interval(0, 2)
>>> I-K
Interval.Ropen(0, 0.500000000000000)
>>> K-I
Interval.Lopen(1, 2)
>>> I.symmetric_difference(K)
Union(Interval.Ropen(0, 0.500000000000000), Interval.Lopen(1, 2))

实数集 \mathbb{R} 在 SymPy 中用 sympy.S.Reals 来表示,自然数使用 sympy.S.Naturals,非负整数用 sympy.S.Naturals0,整数用 sympy.S.Integers 来表示。补集的计算可以用减号,也可以使用 complement 函数。

>>> sympy.S.Reals
Reals
>>> sympy.S.Reals-I
Union(Interval.open(-oo, 0), Interval.open(1, oo))
>>> I.complement(sympy.S.Reals)
Union(Interval.open(-oo, 0), Interval.open(1, oo))
>>> sympy.S.Reals.complement(I)
EmptySet()
>>> I.complement(K)
Interval.Lopen(1, 2)
>>> I.complement(sympy.S.Reals)
Union(Interval.open(-oo, 0), Interval.open(1, oo))

 

SymPy 的逻辑工具

在逻辑运算中,我们可以使用 A, B, C 来代表元素。&, |, ~, >> 分别表示 AND,OR,NOT,imply。而逻辑运算同样可以使用 sympy.simplify_logic 简化。

A, B, C = sympy.symbols("A B C")
>>> sympy.simplify_logic(A | (A & B))
A
>>> sympy.simplify_logic((A>>B) & (B>>A))
(A & B) | (~A & ~B)
>>> A>>B
Implies(A, B)

 

SymPy 的级数工具

SymPy 的级数工具有一部分放在具体数学(Concrete Mathematics)章节了。有的时候,我们希望计算某个级数是发散的,还是收敛的,就可以使用 is_convergence 函数。考虑最常见的级数:

\sum_{n=1}^{\infty}\frac{1}{n} = +\infty

\sum_{n=1}^{\infty}\frac{1}{n^{2}} = \frac{\pi^{2}}{6}

>>> n = sympy.Symbol("n", integer=True)
>>> sympy.Sum(1/n, (n,1,sympy.oo)).is_convergent()
False
>>> sympy.Sum(1/n**2, (n,1,sympy.oo)).is_convergent()
True

如果想计算出收敛级数的值,加上 doit() 函数即可;如果想计算有效数字,加上 evalf() 函数即可。

>>> sympy.Sum(1/n**2, (n,1,sympy.oo)).evalf()
1.64493406684823
>>> sympy.Sum(1/n**2, (n,1,sympy.oo)).doit()
pi**2/6
>>> sympy.Sum(1/n**3, (n,1,sympy.oo)).evalf()
1.20205690315959
>>> sympy.Sum(1/n**3, (n,1,sympy.oo)).doit()
zeta(3)

除了加法之外,SymPy 也支持连乘,其符号是 sympy.Product,考虑

\prod_{n=1}^{+\infty}\frac{n}{n+1}

\prod_{n=1}^{+\infty}\cos\left(\frac{\pi}{n}\right)

>>> sympy.Product(n/(n+1), (n,1,sympy.oo)).is_convergent()
False
>>> sympy.Product(sympy.cos(sympy.pi/n), (n, 1, sympy.oo)).is_convergent()
True

 

SymPy 的 ODE 工具

在常微分方程(Ordinary Differential Equation)中,最常见的就是解方程,而解方程主要是靠 dsolve 函数。例如想求解以下的常微分方程:

df/dx + f(x) = 0,

d^{2}f/dx^{2} + f(x) = 0

d^{3}f/dx^{3} + f(x) = 0

可以使用 dsolve 函数:

>>> f = sympy.Function('f')
>>> sympy.dsolve(sympy.Derivative(f(x),x) + f(x), f(x))
Eq(f(x), C1*exp(-x))
>>> sympy.dsolve(sympy.Derivative(f(x),x,2) + f(x), f(x))
Eq(f(x), C1*sin(x) + C2*cos(x))
>>> sympy.dsolve(sympy.Derivative(f(x),x,3) + f(x), f(x))
Eq(f(x), C3*exp(-x) + (C1*sin(sqrt(3)*x/2) + C2*cos(sqrt(3)*x/2))*sqrt(exp(x)))

而常微分方程对于不同的方程类型也有着不同的解法,可以使用 classify_ode 来判断常微分方程的类型:

>>> sympy.classify_ode(sympy.Derivative(f(x),x) + f(x), f(x))
('separable', '1st_exact', '1st_linear', 'almost_linear', '1st_power_series', 'lie_group', 'nth_linear_constant_coeff_homogeneous', 'separable_Integral', '1st_exact_Integral', '1st_linear_Integral', 'almost_linear_Integral')
>>> sympy.classify_ode(sympy.Derivative(f(x),x,2) + f(x), f(x))
('nth_linear_constant_coeff_homogeneous', '2nd_power_series_ordinary')
>>> sympy.classify_ode(sympy.Derivative(f(x),x,3) + f(x), f(x))
('nth_linear_constant_coeff_homogeneous',)

 

SymPy 的 PDE 工具

在偏微分方程(Partitial Differential Equation)中,同样可以直接求解和判断偏微分方程的类型,分别使用函数 pdsolve() 和 classify_pde()。假设 f = f(x,y) 是一个二元函数,分别满足以下偏微分方程:

\partial f/\partial x + \partial f/\partial y =0

\partial f/\partial x + \partial f/\partial y + f = 0

\partial f/\partial x + \partial f/\partial y + f + 10 = 0

>>> f = sympy.Function("f")(x,y)
>>> sympy.pdsolve(sympy.Derivative(f,x)+sympy.Derivative(f,y),f)
Eq(f(x, y), F(x - y))
>>> sympy.pdsolve(f.diff(x)+f.diff(y)+f,f)
Eq(f(x, y), F(x - y)*exp(-x/2 - y/2))
>>> sympy.pdsolve(f.diff(x)+f.diff(y)+f+10,f)
Eq(f(x, y), F(x - y)*exp(-x/2 - y/2) - 10)

查看类型就用 classify_pde() 函数:

>>> sympy.classify_pde(f.diff(x)+f.diff(y)+f)
('1st_linear_constant_coeff_homogeneous',)
>>> sympy.classify_pde(f.diff(x)+f.diff(y)+f+10,f)
('1st_linear_constant_coeff', '1st_linear_constant_coeff_Integral')
>>> sympy.classify_pde(f.diff(x)+f.diff(y)+f+10,f)
('1st_linear_constant_coeff', '1st_linear_constant_coeff_Integral')

不过目前的 PDE 解法貌似只支持一阶偏导数,二阶或者以上的偏导数就不支持了。

 

SymPy 的数论工具

在数论中,素数就是一个最基本的概念之一。而素数的批量计算,比较快的方法就是筛法(sieve method)。在 sympy 中,同样有 sympy.sieve 这个工具,用于计算素数。如果想输出前100个素数,那么

>>> sympy.sieve._reset()
>>> sympy.sieve.extend_to_no(100)
>>> sympy.sieve._list
array('l', [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601, 607, 613, 617, 619, 631])

如果想输出一个区间内的所有素数,可以使用 primerange(a,b) 函数:

>>> [i for i in sympy.sieve.primerange(10,100)]
[11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]

search() 函数是为了计算某个数附近是第几个素数:

>>> sympy.sieve.search(10)
(4, 5)
>>> sympy.sieve.search(11)
(5, 5)

如果只想获得第 n 个素数,则使用函数 sympy.ntheory.generate.prime(n) 即可。如果是希望计算 x 后面的下一个素数,使用 sympy.ntheory.generate.nextprime(x) 即可。判断 x 是否是素数,可以使用 sympy.ntheory.generate.isprime(x)。

>>> sympy.ntheory.generate.prime(10)
29
>>> sympy.ntheory.generate.nextprime(10)
11
>>> sympy.ntheory.generate.nextprime(11)
13
>>> sympy.ntheory.generate.isprime(11)
True
>>> sympy.ntheory.generate.isprime(12)
False

除此之外,SymPy 的数论方法还有很多,需要读者根据 SymPy 的官方文档自行探索。

 

SymPy 的范畴论工具

SymPy 还支持范畴论(Category Theory)的一些计算方法,在这里简要地列举一下。

>>> A = sympy.categories.Object("A")
>>> B = sympy.categories.Object("B")
>>> f = sympy.categories.NamedMorphism(A,B,"f")
>>> f.domain
Object("A")
>>> f.codomain
Object("B")

由于范畴论是数学的“黑话”,因此其余方法留给范畴论的科研人员自行翻阅。

总结:

整体来看,SymPy 是一个非常卓越的 Python 开源符号计算库。在符号计算领域,不仅支持常见的微积分,线性代数,几何运算,还支持集合论,微分方程,数论等诸多数学方向。后续笔者将会持续跟进并研究这一卓越的开源工具库。

 

参考文献:

  1. Meurer A, Smith C P, Paprocki M, et al. SymPy: symbolic computing in Python[J]. PeerJ Computer Science, 2017, 3: e103.
  2. GitHub:https://github.com/sympy/sympy
  3. SymPy:https://www.sympy.org/en/index.html
  4. Sympy 维基百科:https://en.wikipedia.org/wiki/SymPy
  5. GreatX’s Blog:数值 Python:符号计算:https://vlight.me/2018/04/01/Numerical-Python-Symbolic-Computing/
  6. SymPy 符号计算-让Python帮我们推公式:https://zhuanlan.zhihu.com/p/83822118
  7. Python 科学计算利器—SymPy库:https://www.jianshu.com/p/339c91ae9f41

 

Advertisement

本科学数学专业是一个很好的选择吗?

知乎问题:https://www.zhihu.com/question/319574112

选专业这件事情其实是因人而异的,很难对所有的人给出一个标准的答案,肯定是基于每个人的不同条件来给出完全不同的建议。这是一个千人千面的问题,而不是一个数学题,通常只有一个标准的解答。

就个人这几年的经验来看,无论选择什么专业,都会有这个专业的优势和劣势,好比科技是一把双刃剑,专业也是同样的道理。在这种情况下,我们就需要分析数学专业究竟有哪些优势,有哪些劣势。只有这样,我们才能够根据自身的情况来具体分析,然后决定最终是否选择数学专业。

数学专业的劣势

于是,我们来看一下数学专业的劣势有哪些:

  • 理论知识太多;
  • 实用技能偏少;
  • 转行需要时间。

下面来逐一解读以上几点。我们可以先看一下数学系的常见课表:

  • 第一年:数学分析,高等代数,解析几何,C++等;
  • 第二年:常微分方程,离散数学,复分析,概率论,数值计算,抽象代数等;
  • 第三年:实分析,泛函分析,偏微分方程,拓扑学,微分几何,偏微分方程数值解,随机过程,数理统计等。

从课表上面来看,基本上可以确定几个结论。首先,数学专业作为基础学科,其特点就是理论知识偏多,而学习到的技能偏少,毕竟所学的内容都是理论型,培养的学生都是理论型选手。因此直接导致的结果就是数学系的学生掌握了一堆理论,但是却没有办法把它们直接转化成生产力。在实战中,总不能就靠一门 C++ 来谋求工作吧。其次,既然数学系传授给学生的实用的技能偏少,那么数学系的学生在需要转行的话,就肯定要补充新的技能。在从理论派走向实战派的过程中,不仅要找好自己的前进方向,还需要花费一定的时间和精力去转行。在这里需要澄清一点,转行并不是轻轻松松,而是需要花费时间,勇气和精力的。

如果不想继续从事数学科研的话,其实还是建议数学系的学生可以早一点进入公司去实习或者工作,至少在公司能够体验一下与学术界完全不同的人生。人生总是有多种可能性的,其实可以在本科或者硕士阶段多去体验一下人生。与数学系不同的是,对于计算机或者工程类专业的学生而言,到了本科一定的阶段,都会从事某个项目或者大作业,这种时候他们就会在边看边学中得到一定的成长,实践能力的训练其实比数学系的人会早很多。其实数学系也有实践,只不过延后了许多,一般只有到了博士生的阶段才需要进行科研的训练和动手的操作,在本科和硕士阶段一般是不需要的,因为现代数学的发展已经不是大部分硕士生能够完成的了,当然优秀的人总是有的。

数学专业的优势

在讲了数学专业的劣势之后,也需要强调一下数学专业的优势,其优势包括:

  • 底层通用技能;
  • 技能不易淘汰;
  • 逻辑思维能力;
  • 转行就业面广。

众所周知,无论是在学术界,还是工业界,数学基本上就是一切的基础。如果计算机行业没有数学,那就是XX计算机学院与XX培训班的区别;如果金融行业没有数学,那就是文艺复兴公司与XX小银行的区别。虽然我们不能够一味的拔高数学在各个行业的作用,但是很多行业还是离不开数学的。这就是所谓的底层通用技能,无论是计算机,金融还是其他领域,都离不开数学的支持。

除此之外,再次回到那张数学系的常见课表:

  • 第一年:数学分析,高等代数,解析几何,C++等;
  • 第二年:常微分方程,离散数学,复分析,概率论,数值计算,抽象代数等;
  • 第三年:实分析,泛函分析,偏微分方程,拓扑学,微分几何,偏微分方程数值解,随机过程,数理统计等。

当年读本科的时候是这张课表,其实过了十年,也是这张课表。本科的数学课程基本上集中在20世纪初之前的数学内容,最多到了20世纪中期。而微积分的发展时间则更加久远了。对于数学系的教育而言,很难做到跳过数学分析,高等代数的教育直接进入实分析和泛函分析。就算老师能够教,学生也听不懂啊,还是只能够从基础一步一步开始。而工业界用到的数学,通常也就是数学本科三年级的所有课程就能够包括了。很难用到很多研究生方面的知识,甚至很多时候也就用用微积分和线性代数,概率论就足够了。因此,一旦学会了这些课程,则是终身受益的知识,因为数学的另外一个特点就是永恒性。无论个人发生什么,学校发生什么,世界发生了什么,数学定理就是数学定理,一旦被证明且确定了证明是正确的,那就是永恒的。个人会死亡,学校也有可能走向没落,世界也有可能发生变化,但是数学定理就像一个永恒的石头永远放在那里。

除了以上两点,数学是最能够培养学生逻辑思维的学科。在以上的本科生课程里面,几乎所有东西都是从几个公理出发,然后通过严格的证明,一点一点地得到最终的结论,并且构建出整个数学大厦。在本科教育里面,数学系的学生只要认真学习,通常来说,逻辑思维能力和数学推导能力都会得到一个很大的提升。并且在后续的学习或者工作中,数学的烙印都会深深地印在身上。

最后,数学的就业面其实是相对宽广很多,主要包括:

  • 科研工作者:数学界,金融界,经济界,计算机方向等;
  • 计算机行业;
  • 金融行业;
  • 教育培训行业;
  • 其他行业。

除了可以继续从事本专业之外,其他方向无论是金融还是计算机都可以转。

结论: 整体来看,其实如果自身条件 OK 的话,并且也愿意在本科期间选择数学专业的话。其实选择数学专业是一个不错的选择。在数学系本科这几年可以根据自身的情况来继续选择合适自己的发展方向,并且在研究生或者工作的时候选择一个适合自己的舞台。

Riemann Zeta 函数(二)

在上一篇文章里面,我们已经给出了 Riemann Zeta 函数的定义,

\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}.

其定义域是 [1,\infty)\subseteq\mathbb{R}. 根据级数与定积分的等价关系可以得到:

  1. s = 1 时,\zeta(1) = \infty;
  2. s>1 时,\zeta(s)<\infty.

本文将会重点讲两个内容:

  1. 如何把 Riemann Zeta 函数从 [1,\infty)\subseteq \mathbb{R} 上延拓到 \{s\in \mathbb{C}: \Re(s)>0\} 上;
  2. Riemann Zeta 函数在 \{s\in\mathbb{C}: \Re(s)\geq 1\} 上没有零点。

Riemann Zeta 函数定义域的延拓

如果想把 Riemann Zeta 函数的定义域从 [1,\infty)\subseteq \mathbb{R} 延拓到更大的区域 \{s\in\mathbb{C}:\Re(s)>0\} 上,就需要给出 Riemann Zeta 函数在 \{s\in \mathbb{C}: \Re(s)>0\} 上的定义。而且在原始的定义域 [1,\infty)\subseteq\mathbb{R} 上面,新的函数的取值必须与原函数的取值保持一致。

首先,我们将会在 [1,\infty)\subseteq \mathbb{R} 上面证明如下恒等式:

\zeta(s) = \frac{s}{s-1} - s\int_{1}^{\infty}\frac{\{x\}}{x^{s+1}}dx.

证明:当 s=1 时,上述等式显然成立,两侧都是 \infty.

\frac{s}{s-1}-s\int_{1}^{\infty}\frac{\{x\}}{x^{s+1}}dx

= \frac{s}{s-1} - s\sum_{n=1}^{\infty}\int_{n}^{n+1}\frac{\{x\}}{x^{s+1}}dx 

= \frac{s}{s-1} - s\sum_{n=1}^{\infty}\int_{n}^{n+1}\frac{x-n}{x^{s+1}}dx 

= \frac{s}{s-1} - s\sum_{n=1}^{\infty}\bigg(\int_{n}^{n+1}\frac{1}{x^{s}}dx - \int_{n}^{n+1}\frac{n}{x^{s+1}}dx\bigg)

= \frac{s}{s-1} - s\int_{1}^{\infty}\frac{1}{x^{s}}dx + \sum_{n=1}^{\infty}n\cdot\int_{n}^{n+1}\frac{s}{x^{s+1}}dx

= \sum_{n=1}^{\infty}n\cdot\bigg(\frac{1}{n^{s}}-\frac{1}{(n+1)^{s}}\bigg)

= \sum_{n=1}^{\infty}\bigg(\frac{1}{n^{s-1}}-\frac{1}{(n+1)^{s-1}} + \frac{1}{(n+1)^{s}}\bigg)

= \sum_{n=1}^{\infty}\frac{1}{n^{s}}.

从右式的表达式

\frac{s}{s-1} - s \int_{1}^{\infty}\frac{\{x\}}{x^{s+1}}dx

可以看出 \zeta(s) 可以延拓到 \{s \in\mathbb{C}:\Re(s)>0\} 上。而且右侧的函数在 \{s\in\mathbb{C}:\Re(s)>0,s\neq 1\} 是解析的,并且 s=1 是该函数的一个极点。进一步的分析可以得到,我们得到一个关于 (s-1)\zeta(s) 的解析函数,而且 \lim_{s\rightarrow 1}(s-1)\zeta(s)=1. 综上所述:

  1. Riemann Zeta 函数可以延拓到 \{s\in\mathbb{C}:\Re(s)>0\} 上;
  2. Riemann Zeta 函数在 \{s\in\mathbb{C}:\Re(s)>0, s\neq 1\} 上是解析的;s=1 是 Riemann Zeta 函数的极点。

 

Riemann Zeta 函数的非零区域

著名的 Riemann 猜想说的是 \zeta(s) 函数的所有非平凡零点都在直线 \{s\in\mathbb{C}:\Re(s)=1/2\} 上。因此,数学家首先要找出的就是 Riemann Zeta 函数的非零区域。而本篇文章将会证明 Riemann Zeta 函数在 \{s\in\mathbb{C}:\Re(s)\geq 1\} 上面没有零点。

\Re(s)>1 区域

首先,我们要证明当 \Re(s)>1 时,\zeta(s)\neq 0.

在这里,就需要使用一个重要的恒等式:当 \Re(s)>1 时,

\zeta(s) =\sum_{n=1}^{\infty}\frac{1}{n^{s}}

= \prod_{p}\bigg(1+\frac{1}{p^{s}}+\frac{1}{p^{2s}}+\cdots\bigg)

= \prod_{n=1}^{\infty}\bigg(1-\frac{1}{p_{n}^{s}}\bigg)^{-1},

其中这里的 p 表示所有的素数相乘,而 p_{n} 表示第 n 个素数。

下面我们证明:

\bigg|1-\frac{1}{p_{n}^{s}}\bigg|^{-1}\geq 1-\frac{1}{p_{n}^{\sigma}-1} .

事实上,令 s = \sigma + i t,,当 \sigma=\Re(s)>1 时,我们有

\bigg|1-\frac{1}{p_{n}^{s}}\bigg|^{-1} = \bigg(1+\frac{1}{p_{n}^{s}}+\frac{1}{p_{n}^{2s}}+\cdots\bigg)

\geq 1-\frac{1}{|p_{n}^{s}|}- \frac{1}{|p_{n}^{2s}|} -\cdots

= 1- \frac{1}{p_{n}^{\sigma}} - \frac{1}{p_{n}^{2\sigma}} -\cdots

= 1- \frac{1}{p_{n}^{\sigma}-1}.

因此,

|\zeta(s)| \geq \prod_{n=1}^{\infty}\bigg|1-\frac{1}{p_{n}^{s}}\bigg|^{-1} \geq\prod_{n=1}^{\infty}\bigg(1-\frac{1}{p_{n}^{\sigma}-1}\bigg).

同时,

\lim_{n\rightarrow \infty} \bigg(1- \frac{1}{p_{n}^{\sigma}-1}\bigg) = 1 ,

1-\frac{1}{p_{n+1}^{\sigma}-1} \geq 1- \frac{1}{p_{n}^{\sigma}-1} ,

\sum_{n=1}^{\infty}\frac{1}{p_{n}^{\sigma}}\leq \sum_{n=1}^{\infty}\frac{1}{n^{\sigma}}<\infty when \sigma>1.

所以,当 \Re(s)>1 时,\zeta(s) \neq 0.

\Re(s) =1 直线

Claim 1. 下面我们将会证明恒等式:对于 \sigma >1, \text{ } t\in\mathbb{R},

\Re(\ln\zeta(\sigma + it)) = \sum_{n=2}^{\infty}\frac{\Lambda(n)}{n^{\sigma}\ln(n)}\cos(t\ln(n)) ,

其中当 n 形如 p^{\alpha}, p 是素数,\alpha \geq 1. \Lambda(n) = \ln(p). 而对于其余的 n, \Lambda(n)=0.

事实上,根据 Euler 公式,

\zeta(s) = \prod_{p}\bigg(1-\frac{1}{p^{s}}\bigg)^{-1}.

s = \sigma + it, 可以得到

\ln\zeta(s) = -\sum_{p}\ln\bigg(1-\frac{1}{p^{s}}\bigg)

= \sum_{p}\sum_{\alpha=1}^{\infty}\frac{1}{\alpha p^{\alpha s}}

= \sum_{p}\sum_{\alpha=1}^{\infty}\frac{1}{\alpha p^{\alpha\sigma}}\cdot p^{-i\alpha t}

= \sum_{p}\sum_{\alpha = 1}^{\infty}\frac{1}{\alpha p^{\alpha\sigma}}\cdot e^{-i\alpha t \ln p}

进一步,

\Re(\ln\zeta(s)) = \sum_{p}\sum_{\alpha =1}^{\infty}\frac{1}{\alpha p^{\alpha\sigma}}\cos(\alpha t \ln p)

并且右侧等于

RHS = \sum_{n=2}^{\infty}\frac{\Lambda(n)}{n^{\sigma}\ln(n)}\cos(t\ln(n))

= \sum_{p}\sum_{\alpha = 1}^{\infty} \frac{\ln(p)}{p^{\alpha\sigma}\ln(p^{\alpha})}\cos(t\ln(p^{\alpha}))

= \sum_{p}\sum_{\alpha = 1}^{\infty}\frac{1}{\alpha p^{\alpha\sigma}}\cos(\alpha t\ln p).

所以,恒等式成立,Claim 1 证明完毕。

Claim 2.

\Re(3\ln\zeta(\sigma) + 4\ln\zeta(\sigma+it) + \ln\zeta(\sigma+2it))\geq 0,

其中 \sigma>1, t\in\mathbb{R}. 换句话说

|\zeta(\sigma)^{3}\zeta(\sigma+it)^{4}\zeta(\sigma+2it)|\geq 1.

事实上,

从三角函数的性质可以得到:

3+4\cos(\theta)+\cos(2\theta) = 3 + 4\cos(\theta)+2\cos^{2}(\theta)-1

= 2(\cos(\theta)-1)^{2}\geq 0,

所以,从 Claim 1 可以得到

\Re(3\ln\zeta(\sigma) + 4\ln\zeta(\sigma+it) + \ln\zeta(\sigma+2it))

= \sum_{n=2}^{\infty} \frac{\Lambda(n)}{n^{\sigma}\ln(n)} \cdot ( 3 + 4\cos(t\ln(n)) + \cos(2t\ln(n))) \geq 0.

进一步地,使用 \Re(\ln(z)) = \ln(|z|) 可以得到

0\leq 3\ln|\zeta(\sigma)| + 4\ln|\zeta(\sigma+it)| + \ln|\zeta(\sigma+2it)|

= \ln|\zeta(\sigma)^{3}\zeta(\sigma+it)^{4}\zeta(\sigma+2it)|,

可以推导出 |\zeta(\sigma)^{3}\zeta(\sigma+it)^{4}\zeta(\sigma+2it)|\geq 1. 因此 Claim 2 证明完毕。

Claim 3. \zeta(1+it)\neq 0 对于所有的 \{t\in\mathbb{R}: t\neq 0\} 成立。

反证法:假设 \zeta(s)s=\sigma + it (t\neq 0) 存在阶数为 m 的零点。也就是说:

\lim_{\sigma\rightarrow 1^{+}} \frac{\zeta(\sigma+it)}{(\sigma+it-1)^{m}}=c\neq 0, 其中 m\geq 1.

从 Riemann Zeta 函数的延拓可以知道,\lim_{\sigma\rightarrow 1^{+}}(\sigma -1)\zeta(\sigma) = 1. 并且 \zeta(s)\{s\in\mathbb{C}:\Re(s)>0, s\neq 1\} 上是解析函数。

从 Claim 2 可以得到:

|(\sigma-1)^{3}\zeta(\sigma)^{3}(\sigma+it-1)^{-4m}\zeta(\sigma+it)^{4}\zeta(\sigma+2it)|

\geq |\sigma-1|^{3}|\sigma-1+it|^{-4m}

\geq |\sigma-1|^{3}\cdot |\sigma-1|^{-4m}

= \frac{1}{|\sigma-1|^{4m-3}}.

\sigma\rightarrow 1^{+}, 可以得到左侧趋近于一个有限的值,但是右侧趋近于无穷,所以得到矛盾。也就是说当 t\neq 0 时, \zeta(1+it)\neq 0 成立。

根据之前的知识,s= 1\zeta(s) 的极点,所以我们得到了本篇文章的主要结论:\zeta(s)\{s\in\mathbb{C}:\Re(s)\geq 1\} 上面没有零点。

 

总结

本篇文章从 Riemann Zeta 函数的延拓开始,证明了 Riemann Zeta 函数在 \{s\in\mathbb{C}:\Re(s)\geq 1\} 上没有零点。在下一篇文章中,笔者将会证明在 \Re(s)=1 附近一个“狭长”的区域上,Riemann Zeta 函数没有零点。

 

从调和级数到 RIEMANN ZETA 函数(一)

Riemann Zeta 函数

Riemann Zeta 函数(Riemann zeta function),\zeta(s),是一个关于复数 s 的方程。在复平面上,当复数 s 的实数部分 \sigma=\Re s >1 时,\zeta(s) 就是如下的级数形式:

\zeta(s) = \sum_{n=1}^{\infty}\frac{1}{n^{s}}.

调和级数的概念与性质

既然提到了级数,首先让我们来回顾一下级数的定义是什么?

级数的定义:在数学中,一个有穷或者无穷的序列 (x_{0},x_{1},x_{2},...) 的形式和 S = x_{0}+x_{1}+x_{2}+... 称为级数,里面的每一项都称为级数的通项。

级数收敛的定义:令 S_{n}=x_{0}+...+x_{n},如果存在有限的 S 使得 \lim_{n\rightarrow \infty}S_{n}=S,那么就称该级数收敛。否则,该级数就称为发散级数。

然后下面我们来研究一下调和级数的基本性质。调和级数的表达式写出来十分简单,那就是 Riemann Zeta 函数在 s=1 的取值,i.e.

\zeta(1) = \sum_{n=1}^{+\infty}\frac{1}{n}.

提到级数的收敛或发散,就必须要提到关于级数收敛的等价定理(Cauchy 判别法),那就是:级数 S_{n} 收敛当且仅当对任意的 \epsilon>0,存在 N 使得对于任意的 m, n>N 都有 |S_{m}-S_{n}|<\epsilon.

既然是等价定理,那么就可以使用 Cauchy 判别法来判断调和级数是否收敛。

Method 1.

S_{n}=\sum_{k=1}^{n}\frac{1}{k},

直接通过计算得到

|S_{2n}-S_{n}|=\frac{1}{n+1}+...+\frac{1}{2n}>\frac{1}{2n}+...+\frac{1}{2n}=\frac{1}{2},

说明该级数是不收敛的,也就是调和级数是发散的。

除了基于 Cauchy 收敛准则的证明之外,能否写出判断调和级数发散的其他方法呢?答案是肯定的。以下有一种使用初等数学方法就能够解释调和级数发散的方法。

Method 2.

\sum_{n=1}^{+\infty}\frac{1}{n}

=1+\frac{1}{2}+(\frac{1}{3}+\frac{1}{4})+(\frac{1}{5}+\frac{1}{6}+\frac{1}{7}+\frac{1}{8})+...

>1+\frac{1}{2}+(\frac{1}{4}+\frac{1}{4})+(\frac{1}{8}+\frac{1}{8}+\frac{1}{8}+\frac{1}{8})+...

=1+\frac{1}{2}+\frac{1}{2}+\frac{1}{2}+...=+\infty.

既然都提到了高等数学,那么当然不能仅仅局限于使用初等数学的技巧来解决问题。而且如果只是用初等数学的方法,在拓展性方面就会受到极大的限制。

Method 3. 调和级数的发散可以通过定积分的技巧来进行解决。

HarmonicSeries

1+\frac{1}{2}+...+\frac{1}{n}

>\int_{1}^{2}\frac{1}{x}dx + \int_{2}^{3}\frac{1}{x}dx+...+\int_{n}^{n+1}\frac{1}{x}dx

=\int_{1}^{n+1}\frac{1}{x}dx=\ln(n+1)

因此,\sum_{n=1}^{\infty}\frac{1}{n}=+\infty.

从上面的定积分的方法可以预计出调和级数的量级大约是对数的量级,那么能否精确的估计出来呢?例如下面这个问题:

问题:\lim_{n\rightarrow +\infty}\frac{\sum_{k=1}^{n}\frac{1}{k}}{\ln(n)}=?

通过 L’Hospital 法则可知:\lim_{x\rightarrow 0}x/\ln(1+x)=1.

通过 Stolz 定理可知:

\lim_{n\rightarrow +\infty}\frac{\sum_{k=1}^{n}\frac{1}{k}}{\ln(n)}

= \lim_{n\rightarrow +\infty}\frac{\frac{1}{n}}{\ln(n/(n-1))}

= \lim_{x\rightarrow 0}\frac{x}{\ln(1+x)}=1

除此之外,我们同样可以证明

\lim_{n\rightarrow+\infty}(1+\frac{1}{2}+...+\frac{1}{n}-\ln(n))

这个极限是存在并且有限的。

调和级数的推广

那么,如果在考虑 \zeta(2) 也就是级数

\zeta(2) = \sum_{n=1}^{\infty}\frac{1}{n^{2}}

是否收敛的时候,能否用到以上类似的技巧呢?首先,确实也存在各种各样的初等数学技巧,例如:

Method 1.

\sum_{n=1}^{+\infty}\frac{1}{n^{2}}<1+\sum_{n=2}^{+\infty}\frac{1}{n(n-1)}=1+\sum_{n=2}^{+\infty}(\frac{1}{n-1}-\frac{1}{n})=2.

Method 2. 使用数学归纳法。也就是要证明:

\sum_{k=1}^{n}1/k^{2}\leq 2-\frac{1}{n}.

n=1 的时候,公式是正确的。假设 n 的时候是正确的,那么我们有\sum_{k=1}^{n}1/k^{2}\leq 2-\frac{1}{n}。计算可得:

\sum_{k=1}^{n+1}\frac{1}{k^{2}}

<2-\frac{1}{n}+\frac{1}{(n+1)^{2}}

= 2- \frac{1}{n+1}-\frac{1}{n(n+1)^{2}}

\leq 2-\frac{1}{n+1}.

因此,不等式正确,所以 \sum_{n=1}^{+\infty}1/n^{2} 收敛。

其次,在判断调和级数发散的时候,使用的定积分的方法同样可以应用在这个场景下。

Method 3.

1+\frac{1}{2^{2}}+...+\frac{1}{n^{2}}

<1+\int_{1}^{2}\frac{1}{x^{2}}dx+...+\int_{n-1}^{n}\frac{1}{x^{2}}dx

=1+\int_{1}^{n}\frac{1}{x^{2}}dx=1+1-\frac{1}{n}<2.

那么这个是针对次数等于2的情况,对于一般的情形,

\zeta(s)=\sum_{n=1}^{+\infty}\frac{1}{n^{s}},\sigma = \Re(s)>1.

使用定积分的技术,同样可以证明对于任意的 \sigma = \Re(s)>1,都有 \zeta(s) 是收敛的。但是 \zeta(1) 是发散的。

Riemann Zeta 函数中某些点的取值

除此之外,既然 \zeta(s)\sigma = \Re(s)>1 的时候收敛,能否计算出某些函数的特殊值呢?答案是肯定的,例如,我们可以使用 Fourier 级数来计算出 \zeta(2), \zeta(4), \zeta(6),... 的取值。首先,我们回顾一下 Fourier 级数的一些性质:

假设 f(x) 是一个关于 2\pi 的周期函数, i.e. f(x)=f(x+2\pi) 对于所有的 x \in \mathbb{R} 都成立。那么函数 f(x) 的 Fourier 级数就定义为

a_{0}+\sum_{n=1}^{\infty} (a_{n} \cos(nx) +b_{n} \sin(nx)),

其中,a_{0}= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) dx,

a_{n}= \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos(nx) dx n\geq 1,

b_{n}= \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin(nx) dx n\geq 1,

定理 1. 如果 f(x) 在区间 (-\pi, \pi) 上满足 Lipschitz 条件,那么

f(x) =a_{0}+\sum_{n=1}^{\infty} (a_{n} \cos(nx) +b_{n} \sin(nx)).

定理 2. Parseval’s 恒等式.

\frac{1}{\pi} \int_{-\pi}^{\pi} |f(x)|^{2} dx= 2a_{0}^{2}+ \sum_{n=1}^{\infty} (a_{n}^{2}+b_{n}^{2}).

下面我们就来证明下列恒等式:

\sum_{n=1}^{\infty} \frac{1}{(2n-1)^{2}}=\frac{\pi^{2}}{8}

\sum_{n=1}^{\infty} \frac{1}{n^{2}}=\frac{\pi^{2}}{6}

\sum_{n=1}^{\infty} \frac{1}{(2n-1)^{4}}=\frac{\pi^{4}}{96}

\sum_{n=1}^{\infty} \frac{1}{n^{4}}=\frac{\pi^{4}}{90}

证明:

选择在区间 (-\pi, \pi) 上的函数 f(x)=|x|,并且该函数是关于 2\pi 的周期函数。

使用 a_{n}b_{n} 的公式,我们可以得到函数 f(x)=|x| 的 Fourier 级数是

\frac{\pi}{2} + \sum_{n=1}^{\infty} \frac{2((-1)^{n}-1)}{\pi} \cdot \frac{cos(nx)}{n^{2}}

从定理1, 令 x=0, 可以得到

0= \frac{\pi}{2} + \sum_{n=1}^{\infty} \frac{2((-1)^{n}-1)}{n^{2} \pi} = \frac{\pi}{2} + \sum_{m=1}^{\infty} \frac{-4}{(2m-1)^{2}\pi} = \frac{\pi}{2} - \frac{4}{\pi} \sum_{m=1}^{\infty} \frac{1}{(2m-1)^{2}}

因此,\sum_{n=1}^{\infty} \frac{1}{(2n-1)^{2}}=\frac{\pi^{2}}{8} .

假设 S=\sum_{n=1}^{\infty} \frac{1}{n^{2}} , 可以得到

S=\sum_{odd} \frac{1}{n^{2}} + \sum_{even} \frac{1}{n^{2}} = \frac{\pi^{2}}{8} + \frac{1}{4} S .

因此 S=\frac{\pi^{2}}{6} .

从 Parserval’s 恒等式,我们知道

\frac{2\pi^{2}}{3}= \frac{1}{\pi} \int_{-\pi}^{\pi} x^{2}dx = 2\cdot (\frac{\pi}{2})^{2} + \sum_{n=1}^{\infty} \frac{4((-1)^{n}-1)^{2}}{\pi^{2}\cdot n^{4}} = \frac{\pi^{2}}{2} + \sum_{m=1}^{\infty} \frac{16}{\pi^{2} (2m-1)^{4}}

因此 \sum_{n=1}^{\infty} \frac{1}{(2n-1)^{4}} = \frac{\pi^{4}}{96} .

假设 S=\sum_{n=1}^{\infty} \frac{1}{n^{4}} , 得到

S=\sum_{odd} \frac{1}{n^{4}} + \sum_{even} \frac{1}{n^{4}} = \frac{\pi^{4}}{96} + \frac{1}{16} S

因此, S=\frac{\pi^{4}}{90} .

总结

本篇文章从调和级数的发散性开始,介绍了判断调和级数是否收敛的几种方法。进一步考虑了其他级数的收敛性,并通过 Fourier 级数的方法计算出了部分 Riemann Zeta 函数的取值。

Hausdorff dimension of the graphs of the classical Weierstrass functions

In this paper, we obtain the explicit value of the Hausdorff dimension of the graphs of the classical Weierstrass functions, by proving absolute continuity of the SRB measures of the associated solenoidal attractors.

1. Introduction

In Real Analysis, the classical Weierstrass function is

\displaystyle W_{\lambda,b}(x) = \sum\limits_{n=0}^{\infty} \lambda^n \cos(2\pi b^n x)

with {1/b < \lambda < 1}.

Note that the Weierstrass functions have the form

\displaystyle f^{\phi}_{\lambda,b}(x) = \sum\limits_{n=0}^{\infty} \lambda^n \phi(b^n x)

where {\phi} is a {\mathbb{Z}}-periodic {C^2}-function.

Weierstrass (1872) and Hardy (1916) were interested in {W_{\lambda,b}} because they are concrete examples of continuous but nowhere differentiable functions.

Remark 1 The graph of {f^{\phi}_{\lambda,b}} tends to be a “fractal object” because {f^{\phi}_{\lambda,b}} is self-similar in the sense that

\displaystyle f^{\phi}_{\lambda, b}(x) = \phi(x) + \lambda f^{\phi}_{\lambda,b}(bx)

We will come back to this point later.

Remark 2 {f^{\phi}_{\lambda,b}} is a {C^{\alpha}}-function for all {0\leq \alpha < \frac{-\log\lambda}{\log b}}. In fact, for all {x,y\in[0,1]}, we have

\displaystyle \frac{f^{\phi}_{\lambda, b}(x) - f^{\phi}_{\lambda,b}(y)}{|x-y|^{\alpha}} = \sum\limits_{n=0}^{\infty} \lambda^n b^{n\alpha} \left(\frac{\phi(b^n x) - \phi(b^n y)}{|b^n x - b^n y|^{\alpha}}\right),

so that

\displaystyle \frac{f^{\phi}_{\lambda, b}(x) - f^{\phi}_{\lambda,b}(y)}{|x-y|^{\alpha}} \leq \|\phi\|_{C^{\alpha}} \sum\limits_{n=0}^{\infty}(\lambda b^{\alpha})^n:=C(\phi,\alpha,\lambda,b) < \infty

whenever {\lambda b^{\alpha} < 1}, i.e., {\alpha < -\log\lambda/\log b}.

The study of the graphs of {W_{\lambda,b}} as fractal sets started with the work of Besicovitch-Ursell in 1937.

Remark 3 The Hausdorff dimension of the graph of a {C^{\alpha}}-function {f:[0,1]\rightarrow\mathbb{R}}is

\displaystyle \textrm{dim}(\textrm{graph}(f))\leq 2 - \alpha

Indeed, for each {n\in\mathbb{N}}, the Hölder continuity condition

\displaystyle |f(x)-f(y)|\leq C|x-y|^{\alpha}

leads us to the “natural cover” of {G=\textrm{graph}(f)} by the family {(R_{j,n})_{j=1}^n} of rectangles given by

\displaystyle R_{j,n}:=\left[\frac{j-1}{n}, \frac{j}{n}\right] \times \left[f(j/n)-\frac{C}{n^{\alpha}}, f(j/n)+\frac{C}{n^{\alpha}}\right]

Nevertheless, a direct calculation with the family {(R_{j,n})_{j=1}^n} does not give us an appropriate bound on {\textrm{dim}(G)}. In fact, since {\textrm{diam}(R_{j,n})\leq 4C/n^{\alpha}} for each {j=1,\dots, n}, we have

\displaystyle \sum\limits_{j=1}^n\textrm{diam}(R_{j,n})^d\leq n\left(\frac{4C}{n^{\alpha}}\right)^d = (4C)^{1/\alpha} < \infty

for {d=1/\alpha}. Because {n\in\mathbb{N}} is arbitrary, we deduce that {\textrm{dim}(G)\leq 1/\alpha}. Of course, this bound is certainly suboptimal for {\alpha<1/2} (because we know that {\textrm{dim}(G)\leq 2 < 1/\alpha} anyway).Fortunately, we can refine the covering {(R_{j,n})} by taking into account that each rectangle {R_{j,n}} tends to be more vertical than horizontal (i.e., its height {2C/n^{\alpha}} is usually larger than its width {1/n}). More precisely, we can divide each rectangle {R_{j,n}} into {\lfloor n^{1-\alpha}\rfloor} squares, say

\displaystyle R_{j,n} = \bigcup\limits_{k=1}^{\lfloor n^{1-\alpha}\rfloor}Q_{j,n,k},

such that every square {Q_{j,n,k}} has diameter {\leq 2C/n}. In this way, we obtain a covering {(Q_{j,n,k})} of {G} such that

\displaystyle \sum\limits_{j=1}^n\sum\limits_{k=1}^{\lfloor n^{1-\alpha}\rfloor} \textrm{diam}(Q_{j,n,k})^d \leq n\cdot n^{1-\alpha}\cdot\left(\frac{2}{n}\right)^d\leq (2C)^{2-\alpha}<\infty

for {d=2-\alpha}. Since {n\in\mathbb{N}} is arbitrary, we conclude the desired bound

\displaystyle \textrm{dim}(G)\leq 2-\alpha

A long-standing conjecture about the fractal geometry of {W_{\lambda,b}} is:

Conjecture (Mandelbrot 1977): The Hausdorff dimension of the graph of {W_{\lambda,b}} is

\displaystyle 1<\textrm{dim}(\textrm{graph}(W_{\lambda,b})) = 2 + \frac{\log\lambda}{\log b} < 2

Remark 4 In view of remarks 2 and 3, the whole point of Mandelbrot’s conjecture is to establish the lower bound

\displaystyle \textrm{dim}(\textrm{graph}(W_{\lambda,b})) \geq 2 + \frac{\log\lambda}{\log b}

Remark 5 The analog of Mandelbrot conjecture for the box and packing dimensions is known to be true: see, e.g., these papers here and here).

In a recent paper (see here), Shen proved the following result:

Theorem 1 (Shen) For any {b\geq 2} integer and for all {1/b < \lambda < 1}, the Mandelbrot conjecture is true, i.e.,

\displaystyle \textrm{dim}(\textrm{graph}(W_{\lambda,b})) = 2 + \frac{\log\lambda}{\log b}

Remark 6 The techniques employed by Shen also allow him to show that given {\phi:\mathbb{R}\rightarrow\mathbb{R}} a {\mathbb{Z}}-periodic, non-constant, {C^2} function, and given {b\geq 2} integer, there exists {K=K(\phi,b)>1} such that

\displaystyle \textrm{dim}(\textrm{graph}(f^{\phi}_{\lambda,b})) = 2 + \frac{\log\lambda}{\log b}

for all {1/K < \lambda < 1}.

Remark 7 A previous important result towards Mandelbrot’s conjecture was obtained by Barańsky-Barány-Romanowska (in 2014): they proved that for all {b\geq 2} integer, there exists {1/b < \lambda_b < 1} such that

\displaystyle \textrm{dim}(\textrm{graph}(W_{\lambda,b})) = 2 + \frac{\log\lambda}{\log b}

for all {\lambda_b < \lambda < 1}.

The remainder of this post is dedicated to give some ideas of Shen’s proof of Theorem1 by discussing the particular case when {1/b<\lambda<2/b} and {b\in\mathbb{N}} is large.

2. Ledrappier’s dynamical approach

If {b\geq 2} is an integer, then the self-similar function {f^{\phi}_{\lambda,b}} (cf. Remark 1) is also {\mathbb{Z}}-periodic, i.e., {f^{\phi}_{\lambda,b}(x+1) = f^{\phi}_{\lambda,b}(x)} for all {x\in\mathbb{R}}. In particular, if {b\geq 2} is an integer, then {\textrm{graph}(f^{\phi}_{\lambda,b})} is an invariant repeller for the endomorphism {\Phi:\mathbb{R}/\mathbb{Z}\times\mathbb{R}\rightarrow \mathbb{R}/\mathbb{Z}\times\mathbb{R}} given by

\displaystyle \Phi(x,y) = \left(bx\textrm{ mod }1, \frac{y-\phi(x)}{\lambda}\right)

This dynamical characterization of {G = \textrm{graph}(f^{\phi}_{\lambda,b})} led Ledrappier to the following criterion for the validity of Mandelbrot’s conjecture when {b\geq 2} is an integer.

Denote by {\mathcal{A}} the alphabet {\mathcal{A}=\{0,\dots,b-1\}}. The unstable manifolds of {\Phi}through {G} have slopes of the form

\displaystyle (1,-\gamma \cdot s(x,u))

where {\frac{1}{b} < \gamma = \frac{1}{\lambda b} <1}, {x\in\mathbb{R}}, {u\in\mathcal{A}^{\mathbb{N}}}, and

\displaystyle s(x,u):=\sum\limits_{n=0}^{\infty} \gamma^n \phi'\left(\frac{x + u_1 + u_2 b + \dots + u_n b^{n-1}}{b^n}\right)

In this context, the push-forwards {m_x := (u\mapsto s(x,u))_*\mathbb{P}} of the Bernoulli measure {\mathbb{P}} on {\mathcal{A}^{\mathbb{N}}} (induced by the discrete measure assigning weight {1/b} to each letter of the alphabet {\mathcal{A}}) play the role of conditional measures along vertical fibers of the unique Sinai-Ruelle-Bowen (SRB) measure {\theta} of the expanding endomorphism {T:\mathbb{R}/\mathbb{Z}\times\mathbb{R} \rightarrow \mathbb{R}/\mathbb{Z}\times\mathbb{R}},

\displaystyle T(x,y) = (bx\textrm{ mod }1, \gamma y + \psi(x)),

where {\gamma=1/\lambda b} and {\psi(x)=\phi'(x)}. In plain terms, this means that

\displaystyle \theta = \int_{\mathbb{R}/\mathbb{Z}} m_x \, d\textrm{Leb}(x) \ \ \ \ \ (1)

where {\theta} is the unique {T}-invariant probability measure which is absolutely continuous along unstable manifolds (see Tsujii’s paper).

As it was shown by Ledrappier in 1992, the fractal geometry of the conditional measures {m_x} have important consequences for the fractal geometry of the graph {G}:

Theorem 2 (Ledrappier) Suppose that for Lebesgue almost every {x\in\mathbb{R}} the conditional measures {m_x} have dimension {\textrm{dim}(m_x)=1}, i.e.,

\displaystyle \lim\limits_{r\rightarrow 0}\frac{\log m_x(B(z,r))}{\log r} = 1 \textrm{ for } m_x\textrm{-a.e. } z

Then, the graph {G=\textrm{graph}(f^{\phi}_{\lambda,b})} has Hausdorff dimension

\displaystyle \textrm{dim}(G) = 2 + \frac{\log\lambda}{\log b}

Remark 8 Very roughly speaking, the proof of Ledrappier theorem goes as follows. By Remark 4, it suffices to prove that {\textrm{dim}(G)\geq 2 + \frac{\log\lambda}{\log b}}. By Frostman lemma, we need to construct a Borel measure {\nu} supported on {G} such that

\displaystyle \underline{\textrm{dim}}(\nu) := \textrm{ ess }\inf \underline{d}(\nu,x) \geq 2 + \frac{\log\lambda}{\log b}

where {\underline{d}(\nu,x):=\liminf\limits_{r\rightarrow 0}\log \nu(B(x,r))/\log r}. Finally, the main point is that the assumptions in Ledrappier theorem allow to prove that the measure {\mu^{\phi}_{\lambda, b}} given by the lift to {G} of the Lebesgue measure on {[0,1]} via the map {x\mapsto (x,f^{\phi}_{\lambda,b}(x))}satisfies

\displaystyle \underline{\textrm{dim}}(\mu^{\phi}_{\lambda,b}) \geq 2 + \frac{\log\lambda}{\log b}

An interesting consequence of Ledrappier theorem and the equation 1 is the following criterion for Mandelbrot’s conjecture:

Corollary 3 If {\theta} is absolutely continuous with respect to the Lebesgue measure {\textrm{Leb}_{\mathbb{R}^2}}, then

\displaystyle \textrm{dim}(G) = 2 + \frac{\log\lambda}{\log b}

Proof: By (1), the absolute continuity of {\theta} implies that {m_x} is absolutely continuous with respect to {\textrm{Leb}_{\mathbb{R}}} for Lebesgue almost every {x\in\mathbb{R}}.

Since {m_x\ll \textrm{Leb}_{\mathbb{R}}} for almost every {x} implies that {\textrm{dim}(m_x)=1} for almost every {x}, the desired corollary now follows from Ledrappier’s theorem. \Box

3. Tsujii’s theorem

The relevance of Corollary 3 is explained by the fact that Tsujii found an explicittransversality condition implying the absolute continuity of {\theta}.

More precisely, Tsujii firstly introduced the following definition:

Definition 4

  • Given {\varepsilon>0}, {\delta>0} and {x_0\in\mathbb{R}/\mathbb{Z}}, we say that two infinite words {u, v\in\mathcal{A}^{\mathbb{N}}} are {(\varepsilon,\delta)}-transverse at {x_0} if either

    \displaystyle |s(x_0,u)-s(x_0,v)|>\varepsilon

    or

    \displaystyle |s'(x_0,u)-s'(x_0,v)|>\delta

  • Given {q\in\mathbb{N}}, {\varepsilon>0}, {\delta>0} and {x_0\in\mathbb{R}/\mathbb{Z}}, we say that two finite words {k,l\in\mathcal{A}^q} are {(\varepsilon,\delta)}-transverse at {x_0} if {ku}, {lv} are {(\varepsilon,\delta)}-transverse at {x_0}for all pairs of infinite words {u,v\in\mathcal{A}^{\mathbb{N}}}; otherwise, we say that {k} and {l} are{(\varepsilon,\delta)}-tangent at {x_0};
  • {E(q,x_0;\varepsilon,\delta):= \{(k,l)\in\mathcal{A}^q\times\mathcal{A}^q: (k,l) \textrm{ is } (\varepsilon,\delta)\textrm{-tangent at } x_0\}}
  • {E(q,x_0):=\bigcap\limits_{\varepsilon>0}\bigcap\limits_{\delta>0} E(q,x_0;\varepsilon,\delta)};
  • {e(q,x_0):=\max\limits_{k\in\mathcal{A}^q}\#\{l\in\mathcal{A}^q: (k,l)\in E(q,x_0)\}}
  • {e(q):=\max\limits_{x_0\in\mathbb{R}/\mathbb{Z}} e(q,x_0)}.

Next, Tsujii proves the following result:

Theorem 5 (Tsujii) If there exists {q\geq 1} integer such that {e(q)<(\gamma b)^q}, then

\displaystyle \theta\ll\textrm{Leb}_{\mathbb{R}^2}

Remark 9 Intuitively, Tsujii’s theorem says the following. The transversality condition {e(q)<(\gamma b)^q} implies that the majority of strong unstable manifolds {\ell^{uu}}are mutually transverse, so that they almost fill a small neighborhood {U} of some point {x_0} (see the figure below extracted from this paper of Tsujii). Since the SRB measure {\theta} is absolutely continuous along strong unstable manifolds, the fact that the {\ell^{uu}}‘s almost fill {U} implies that {\theta} becomes “comparable” to the restriction of the Lebesgue measure {\textrm{Leb}_{\mathbb{R}^2}} to {U}.

tsujiiacta

Remark 10 In this setting, Barańsky-Barány-Romanowska obtained their main result by showing that, for adequate choices of the parameters {\lambda} and {b}, one has {e(1)=1}. Indeed, once we know that {e(1)=1}, since {1<\gamma b}, they can apply Tsujii’s theorem and Ledrappier’s theorem (or rather Corollary 3) to derive the validity of Mandelbrot’s conjecture for certain parameters {\lambda} and {b}.

For the sake of exposition, we will give just a flavor of the proof of Theorem 1 by sketching the derivation of the following result:

Proposition 6 Let {\phi(x) = \cos(2\pi x)}. If {1/2<\gamma=1/\lambda b <1} and {b\in\mathbb{N}} is sufficiently large, then

\displaystyle e(1)<\gamma b

In particular, by Corollary 3 and Tsujii’s theorem, if {1/2<\gamma=1/\lambda b <1} and {b\in\mathbb{N}} is sufficiently large, then Mandelbrot’s conjecture is valid, i.e.,

\displaystyle \textrm{dim}(W_{\lambda,b}) = 2+\frac{\log\lambda}{\log b}

Remark 11 The proof of Theorem 1 in full generality (i.e., for {b\geq 2} integer and {1/b<\lambda<1}) requires the introduction of a modified version of Tsujii’s transversality condition: roughly speaking, Shen defines a function {\sigma(q)\leq e(q)}(inspired from Peter-Paul inequality) and he proves

  • (a) a variant of Proposition 6: if {b\geq 2} integer and {1/b<\lambda<1}, then {\sigma(q)<(\gamma b)^q} for some integer {q};
  • (b) a variant of Tsujii’s theorem: if {\sigma(q)<(\gamma b)^q} for some integer {q}, then {\theta\ll\textrm{Leb}_{\mathbb{R}^2}}.

See Sections 2, 3, 4 and 5 of Shen’s paper for more details.

We start the (sketch of) proof of Proposition 6 by recalling that the slopes of unstable manifolds are given by

\displaystyle s(x,u):=-2\pi\sum\limits_{n=0}^{\infty} \gamma^n \sin\left(2\pi\frac{x + u_1 + u_2 b + \dots + u_n b^{n-1}}{b^n}\right)

for {x\in\mathbb{R}}, {u\in\mathcal{A}^{\mathbb{N}}}, so that

\displaystyle s'(x,u)=-4\pi^2\sum\limits_{n=0}^{\infty} \left(\frac{\gamma}{b}\right)^n \cos\left(2\pi\frac{x + u_1 + u_2 b + \dots + u_n b^{n-1}}{b^n}\right)

Remark 12 Since {\gamma/b < \gamma}, the series defining {s'(x,u)} converges faster than the series defining {s(x,u)}.

By studying the first term of the expansion of {s(x,u)} and {s'(x,u)} (while treating the remaining terms as a “small error term”), it is possible to show that if {(k,l)\in E(1,x_0)}, then

\displaystyle \left|\sin\left(2\pi\frac{x_0+k}{b}\right) - \sin\left(2\pi\frac{x_0+l}{b}\right)\right| \leq\frac{2\gamma}{1-\gamma} \ \ \ \ \ (2)

and

\displaystyle \left|\cos\left(2\pi\frac{x_0+k}{b}\right) - \cos\left(2\pi\frac{x_0+l}{b}\right)\right| \leq \frac{2\gamma}{b-\gamma} \ \ \ \ \ (3)

(cf. Lemma 3.2 in Shen’s paper).

Using these estimates, we can find an upper bound for {e(1)} as follows. Take {x_0\in\mathbb{R}/\mathbb{Z}} with {e(1)=e(1,x_0)}, and let {k\in\mathcal{A}} be such that {(k,l_1),\dots,(k,l_{e(1)})\in E(1,x_0)} distinct elements listed in such a way that

\displaystyle \sin(2\pi x_i)\leq \sin(2\pi x_{i+1})

for all {i=1,\dots,e(1)-1}, where {x_i:=(x_0+l_i)/b}.

From (3), we see that

\displaystyle \left|\cos\left(2\pi x_i\right) - \cos\left(2\pi x_{i+1}\right)\right| \leq \frac{4\gamma}{b-\gamma}

for all {i=1,\dots,e(1)-1}.

Since

\displaystyle (\cos(2\pi x_i)-\cos(2\pi x_{i+1}))^2 + (\sin(2\pi x_i)-\sin(2\pi x_{i+1}))^2 = 4\sin^2(\pi(x_i-x_{i+1}))\geq 4\sin^2(\pi/b),

it follows that

\displaystyle |\sin(2\pi x_i)-\sin(2\pi x_{i+1})|\geq \sqrt{4\sin^2\left(\frac{\pi}{b}\right) - \left(\frac{4\gamma}{b-\gamma}\right)^2} \ \ \ \ \ (4)

Now, we observe that

\displaystyle \sqrt{4\sin^2\left(\frac{\pi}{b}\right) - \left(\frac{4\gamma}{b-\gamma}\right)^2} > \frac{4}{b} \ \ \ \ \ (5)

for {b} large enough. Indeed, this happens because

  • {\sqrt{z^2-w^2}>2(z-w)} if {z+w>4(z-w)};
  • {z+w>4(z-w)} if {z/w:=u < 5/3};
  • {\frac{2\sin(\frac{\pi}{b})}{\frac{4\gamma}{b-\gamma}}\rightarrow \frac{2\pi}{4\gamma} (< \frac{5}{3})} as {b\rightarrow\infty}, and {2\sin(\frac{\pi}{b}) - \frac{4\gamma}{b-\gamma} \rightarrow (2\pi-4\gamma)\frac{1}{b} (>\frac{2}{b})} as {b\rightarrow\infty} (here we used {\gamma<1}).

By combining (4) and (5), we deduce that

\displaystyle |\sin(2\pi x_i)-\sin(2\pi x_{i+1})| > 4/b

for all {i=1,\dots, e(1)-1}.

Since {-1\leq\sin(2\pi x_1)\leq\sin(2\pi x_2)\leq\dots\leq\sin(2\pi x_{e(1)})\leq 1}, the previous estimate implies that

\displaystyle \frac{4}{b}(e(1)-1)<\sum\limits_{i=1}^{e(1)-1}(\sin(2\pi x_{i+1}) - \sin(2\pi x_i)) = \sin(2\pi x_{e(1)}) - \sin(2\pi x_1)\leq 2,

i.e.,

\displaystyle e(1)<1+\frac{b}{2}

Thus, it follows from our assumptions ({\gamma>1/2}, {b} large) that

\displaystyle e(1)<1+\frac{b}{2}<\gamma b

This completes the (sketch of) proof of Proposition 6 (and our discussion of Shen’s talk).

从对数学的贡献上来讲,丘成桐有多厉害?

作者:匿名用户
链接:https://www.zhihu.com/question/33463090/answer/116836782
来源:知乎
著作权归作者所有,转载请联系作者获得授权。

补充一下某匿名用户的回答。他只是说了大方面,我来给大家补充一点细节。这些故事都是笔者多年来从不同渠道收集到的,虽然未必准确,却能很好地反映出丘先生高尚的人品,卓越的才能,和为祖国数学事业无私奉献的精神。

1.丘成桐教授不仅有数学才华,还很有商业天赋。他在Boston地区有三十多套房产。因为Harvard是个很有钱的学校,所以有很多闲置的房产,他们会用极低的价格把这些房产卖给教授。丘成桐教授以其杰出的商业眼光,前前后后一共买了三十多套,租给他的博士后,每年盈利不可胜计,真是令人钦佩!后来丘教授又看中了一处房子,但是学校却不愿意批准卖给他,所以他让当时是系主任的Ben Gross教授去询问缘由,后来Gross说,学校得知你在Boston地区有三十多套房产,实在太多了,所以不能卖给你。大家知道,在数学界,要想组织seminar和conference,经费是必不可少的。正因为丘教授有杰出的商业头脑和投资眼光,所以为中国数学的蓬勃发展输入了大量的物质财富,可谓是中国版的Simons。但是他的数学水平又远胜Simons,所以丘教授无愧为古往今来第一大师!

2.丘教授通过这些seminar和conference让大量的中国年轻数学家有了抛头露面和展示自己的机会。虽然这些年轻人的数学水平只可意会,但是相信通过丘教授的帮助会很快发展成为华人数学界的领军人物,继承他的资源和衣钵。近年来,丘教授在中国大陆,中国香港和台湾地区设立了大量的研究所。这些研究所的设立不但给不少人提供了很好的工作机会,也给不少想学数学的年轻人提供了优秀的平台。比如清华大学的丘成桐数学中心,可以说是亚洲第一数学中心,连日本京都的RIMS都是远远不如的,我想即使放到宇宙上也是名列前茅的。在这里我们应该特别欢迎广大二本和三本的数学系学生报考这些研究所,因为丘先生的理念就是要给普通高校热爱数学的学生以机会。

3.丘教授每年都到中国的各所高校讲学,尤其是他开设的几个数学中心,这些讲座传授给年轻人许多高深的数学知识和实用的数学技巧。他演讲的话题包括:数学之美、我的成功经验、Harvard数学系的历史和我的一个不听话的学生等等。内容丰富,发人深省,不但能从中学到数学知识,还能体会到许多做(中国)人的道理。可悲的是,一些反动派受到西方自由思想的荼毒,对这样高质量的讲座却视而不见,拒绝参加,其中包括一些数学界的同行。丘教授知悉此事后,给这些人发了一封邮件,明确要求他们:今后只要是我来你们学校做讲座,所有中国人就必须参加!丘先生的严厉做法很好地整肃了华人数学界的风气,提高了凝聚力。相信在丘先生的领导下,大家一定能鼓足干劲,力争上游,多快好省地建设中国数学!

4.丘教授亲自培养的许多学生都有极高的数学水准,在国际上获得广泛承认,多次荣获重大国际奖项,比如晨兴数学奖、新世界数学奖、陈省身奖之中国版等等。这些学生不仅自己水平惊人,对年轻人也提供了无微不至的关怀和细致周到的帮助。比如,丘教授的不少学生害怕学生没有自己的想法,经常亲自给学生提供idea,来帮助学生找到研究的思路。即使学生不需要也要苦口薄心,再三敦促。这样一来,不仅学生可以发paper,他们自己也因为贡献了一个“关键的”idea而顺便加到了名字,可谓是一举两得的做法。丘教授另一些学生因为害怕国际上一些著名杂志的编辑是势利眼,不让年轻学生单独发paper,所以不惜牺牲自己的名节,主动要求在paper上加名字。这样一来,学生发文章的时候就不会吃亏了。他们为学生的付出令人感动。可悲的是,一些年轻人不但不知道感恩,反而对此感到苦恼。对这样的人,我们就应该毫不犹豫地把他们踢出华人数学界,让他们去落后的西方世界吃点苦头!

5.丘教授掌握了国际上一本极为重要的数学杂志,即Journal of Differential Geometry。这本杂志现在成为许多年轻人展示自己只可意会的数学水平和找到教职的最佳平台。为了方便某些中国学生在杂志上发表论文,丘教授提供了一些非同寻常的便捷渠道。比如文章不用发给编辑,可以直接发给自己,再由他转发给编辑。这样一来,中国数学家的文章就经常出现在顶级杂志上,他们的研究水准得到了空前飞跃!丘教授控制的另一本杂志就是大名鼎鼎的Asian Journal。这本杂志上发表了人类在20世纪到21世纪一些最伟大的数学工作,比如朱熹平教授和曹怀东教授对Poincare猜想的最终证明,封顶了人类一百余年来悬而未决的难题。这篇文章长达300多页,但是经过Asian Journal的编辑不知疲倦的辛勤工作,该论文在极短的时间内就获得了发表。可以看到,丘教授在经营杂志以后,杂志审核文章的效率大大提高了。可以说,正是丘教授勤劳刻苦,生命不息,奋斗不止的精神感召了这些编辑,让他们不再玩忽职守和放松懈怠。

6.丘成桐教授对自己学生的关怀可以说是无微不至。有些学生一时糊涂涉嫌抄袭和剽窃,丘教授知道以后果断采取措施,息事宁人,避免了家丑外扬。中国数学界正是在丘先生的努力下才能铁板一块地团结在一起,大家毫无私心,全心全意为中国数学的发展添砖加瓦。但是有些人却不明白丘教授的苦心,经常在丘教授面前投诉,甚至还写匿名信把事情闹到别的学校。对此,丘教授态度坚决,铁面无私地无视了这些无理要求,可以说很好地体现了一位领袖的英明果决。而那些闹事的逆流虽然可能有一点点数学水平,但是今天也没办法站出来领导数学界了。就是因为某些人只知道做研究和思考数学问题,没有意识到帮助中国数学发展才是更有意义的事。思想境界比起丘教授差的太远了。可以说,丘先生高瞻远瞩,气盖环宇,数风流人物,还看今朝。

7.丘教授对中国学生的关心不仅仅局限在数学系,还遍及到各个非数学领域。从前,只要是中国、香港和台湾去Harvard读数学的学生,丘教授都要亲自过问,热情关怀,把他们一一纳入自己门下。比如某学生要跟Taubes,他会亲自找到Taubes,告诉他,这位学生就托付给你了。这样一来,这些西方数学家慑于丘先生的气魄和威望,就不敢再歧视中国学生了。到了后来,只要去Harvard的中国、香港和台湾学生,无论学什么专业,丘先生都要跟他们打交道。据说他还曾经举办过大型party,邀请Harvard商学院大中华地区的所有学生参加。这些活动使他亲民的形象更加突出,在各界广受好评。相信不久的将来,丘教授会吸引到亚洲其他地区的学生参与他的party。像他这的一代王者,相信任何人都会被他的魅力所感召。毕竟只有深入到人民群众中去,才能发现问题所在。丘教授真不愧为一代明君!

8.丘教授虽然已经接近70高龄,仍然老骥伏枥,近年来在数学研究上非常活跃。仅2015一年就在arxiv贴文23篇,以每个月两篇论文的速度进行高质量的数学研究,这是古往今来其他任何数学家都望尘莫及的!要知道,丘教授作为华人数学界的领袖,每天要处理几百封邮件。熟悉丘教授的朋友们都知道,即使是在seminar上他也要一边摁手机收发邮件,一边听talk。能在如此繁忙的情况下一个月写两篇论文,效率之高真是令人震惊!丘教授还特别注意与年轻人的合作,近年来每篇论文几乎都要提携一些年轻数学家,大度地和他们一起署名发表。由于他提携的年轻数学家太多,很多时候甚至会忘记自己的合作者。比如某韩国数学家之前跟他有合作,到了找教职的时候希望丘教授能帮自己写推荐信,但是丘教授却坦言自己并不认识对方。实际上,丘教授不认识自己的合作者正可以反映出他已经帮助了太多年轻人,以至于自己都想不起来自己干的那些好事!范仲淹说:云山苍苍,江水泱泱,先生之风,山高水长。丘先生年近七旬而笔耕不辍,真可谓吾辈典范!

9.丘成桐教授对于人才优劣的判断也是明察秋毫,一望即知。早先,北大一个学生仗着自己是那一届最优秀的就自不量力,想要去Harvard跟丘教授学数学,丘教授对他说:你水平不行。想跟我也可以,先去Boston待两年,经我考察合格了,再来跟我。这个学生不得已之下去了另一个inferior的学校跟了一个比丘教授差了十万八千里的数学家M。事实证明,这个学生现在虽然出了一点小名,在Yale做教授,但是确实不够资格在Harvard做丘教授的学生:因为他只拿到了晨兴数学银奖,而丘教授的学生一般都是拿金奖的。
还有一次,丘教授的学生,国际著名数学家刘教授的一个学生L经刘教授推荐去Harvard师从丘成桐教授,而刘教授另一个学生不服,认为自己比L优秀。他给丘教授发邮件针对此事发了一大堆牢骚,丘教授立刻把他的邮件转发给了刘教授,叫刘教授严加管教。而事实证明,虽然这个学生目前在自己的领域是一个优秀的数学家,但是比起L来差的太远了,因为后者后来解决了国际上多年悬而未决的Hopf猜想,即使在历史上也要留名的。值得一提的是,L不仅数学了得,他满腔的爱国情怀也令人感动。有一次,Harvard一位教授不小心把台湾说成是一个国家,L立刻站起来,义正辞严地告诉该教授:“台湾是中国不可分割的一部分!”像这样品学兼优的杰出青年正是建设祖国数学事业所需要的人才啊!如果不是丘教授乾纲独断,岂不失之交臂?
像丘成桐教授这样慧眼识金的伯乐正是中国数学界最需要的伟大领袖。只要有了他,没有一个人才会被埋没,没有一个庸才可以投机。野无遗贤,万邦咸宁。天降丘神,万物生明!
——————————————–
先写到这里,丘先生的贡献还有很多,许多细节的地方因为空白太小,都已经写不下了,有待日后慢慢总结整理。作为丘教授的铁杆粉丝,我要告诉学数学的年轻人一个简单的道理:没有丘成桐教授开天辟地,创造了数学这个领域,哪来你们今天的归宿?所以,学好数学固然重要,但是更重要的是坚持丘教授在中国数学界的领导地位,紧密团结在他的周围,为早日把中国建设成数学强国而奋斗!军民团结如一人,试看天下谁能敌!

低维动力系统

One Dimensional Real and Complex Dynamics需要学习的资料:

复分析基础:本科生课程

(1) Complex Analysis, 3rd Edition, Lars V. Ahlfors

(2) Complex Analysis, Elias M. Stein

进阶复分析:研究生课程

(1) Lectures on Riemann Surfaces (GTM 81), Otto Forster

(2) Lectures on Quasiconformal Mappings, Lars V. Ahlfors

实分析基础:本科生课程

(1) Real Analysis, Rudin

(2) Real Analysis, Elias M. Stein

专业书籍:

实动力系统:

(1) One Dimensional Dynamics, Welington de Melo & Sebastian VanStrien

(2) Mathematical Tools for One-Dimensional Dynamics (Cambridge Studies in Advanced Mathematics), Edson de Faria / Welington de Melo

复动力系统:

(3) Dynamics in One Complex Variable, John Milnor

(4) Complex Dynamics, Lennart Carleson

(5) Complex Dynamics and Renormalization, Curtis T. McMullen

(6) Renormalization and 3-Manifolds Which Fiber over the Circle, Curtis T. McMullen

(7) Iteration of rational functions (GTM 132), Alan F. Beardon

遍历论:

(8) An Introduction to Ergodic Theory (GTM 79), Walters Peter

Complex Analysis

学习复分析也已经很多年了,七七八八的也读了不少的书籍和论文。略作总结工作,方便后来人学习参考。 复分析是一门历史悠久的学科,主要是研究解析函数,亚纯函数在复球面的性质。下面一一介绍这些基本内容。

300px-Mandel_zoom_00_mandelbrot_set

(1)提到复变函数,首先需要了解复数 (Complex Numbers) 的基本性质和四则运算规则。怎么样计算复数的平方根,极坐标与xy坐标的转换,复数的模之类的。这些在高中的时候基本上都会学过。

(2)复变函数自然是在复平面上来研究问题,此时数学分析里面的求导数之类的运算就会很自然的引入到复平面里面,从而引出解析函数 (Holomorphic Functions / Analytic Functions) 的定义。那么研究解析函数的性质就是关键所在。最关键的地方就是所谓的Cauchy—Riemann公式,这个是判断一个函数是否是解析函数的关键所在。

(3)明白解析函数的定义以及性质之后,就会把数学分析里面的曲线积分 (Line Integrals) 的概念引入复分析中,定义几乎是一致的。在引入了闭曲线和曲线积分之后,就会有出现复分析中的重要的定理:Cauchy积分公式 (Cauchy’s Integral Formula)。这个是复分析的第一个重要定理。

(4)既然是解析函数,那么函数的定义域 (Domain) 就是一个关键的问题。可以从整个定义域去考虑这个函数,也可以从局部来研究这个函数。这个时候研究解析函数的奇点 (Singularity) 就是关键所在,奇点根据性质分成可去奇点 (Removable Singularity),极点 (Pole),本性奇点 (Essential Singularity) 三类,围绕这三类奇点,会有各自奇妙的定理。

(5)复变函数中,留数定理 (Residue Theorem) 是一个重要的定理,反映了曲线积分和零点极点的性质。与之类似的幅角定理也展示了类似的关系。

(6)除了积分,导数也是解析函数的一个研究方向。导数加上收敛 (Convergence) 的概念就可以引出 Taylor 级数 (Taylor Series) 和 Laurent 级数 (Laurent Series) 的概念。除此之外,正规族 (Normal Families) 里面有一个非常重要的定理,那就是Arzela定理。

(7)以上都是从分析的角度来研究复分析,如果从几何的角度来说,最重要的定理莫过于 Riemann 映照定理 (Riemann Mapping Theorem)。这个时候一般会介绍线性变换,就是 Mobius 变换 (Mobius Transforms),把各种各样的单连通区域映射成单位圆。研究 Mobius 变换的保角和交比之类的性质。

(8)椭圆函数 (Elliptic Functions),经典的双周期函数 (Double Periodic Functions)。这里有 Weierstrass 理论,是研究 Weierstrass 函数的,有经典的微分方程,以及该函数的性质。 以上就是复分析或者复变函数的一些课程介绍,如果有遗漏或者疏忽的地方请大家指教。

推荐书籍:

(1)Complex Analysis,3rd Edition,Lars V.Ahlfors

ahlfors.jpg

(2)Complex Analysis,Elias M. Stein

stein.jpg

调和分析

之前在北京大学学了整整一个学期的调和分析,是由BICMR的苗老师主讲。在这门课上我受益匪浅,故写一篇文章来感激下这位老师,同时写一下自己学习调和分析的感受。

调和分析起源于 Fourier 这位数学家的研究,故也可以称为 Fourier 分析。其主要内容包括算子插值方法,Hardy-Littlewood 极大算子,Fourier变换,Calderon-Zygmund’s Inequality, 函数空间,Ap 权等等。下面一一介绍这些基本内容。

(1) 算子插值方法

里面主要有 Marcinkiewicz Interpolation Theorem 和 Riesz Thorin Interpolation Theorem两个定理,分别是用实变方法和复变方法证明的。这两个定理则是研究算子的 L^p 有界性的关键定理,是整个调和分析的基础。

(2) Hardy—Littlewood Maximal Operator

这个是一个相当重要的拟线性算子,利用 Vitali Covering Theorem 和 Marcinkiewicz Interpolation Theorem 可以证明该算子是 L^p 有界的。证明过程不超过10行,但是证明过程相当的漂亮。

(3) Fourier Transformation

调和分析的主要工具,这个工具不仅仅在调和分析上有用,在 PDE 和随机过程中,这也是一个相当重要的工具。它把一个物理空间上的函数,转换成频率空间上的函数,从而获得了很多很好的性质。

(4) Calderon-Zygmund’s Inequality

这个定理是调和分析的经典定理之一,是处理卷积型的奇异积分的。可以看成是 Minkowskii 不等式的推广。Zygmund 把定理的条件放的很弱,只需要加上 Hormander 条件就可以得到算子的 L^p 有界性。然后也可以考虑其条件的充要条件。

(5) 函数空间

调和分析里面提到的函数空间包括 Sobolev space, Lipschitz space, Hardy space, Besov space 等等。 其中 Sobolev space 在 PDE 上面用处广泛,其代表作就是 Adams 的 Sobolev Space. Besov Space 里面有一个插值定理,也相当的重要,差不多5页吧,当时苗老师让我们全部背下来,嘿嘿。另外, Hardy Space里面有一个相当重要的定理,就是所谓的 Duality of BMO and H^1 Space. 其证明过程大概有10页吧,是由 C.Fefferman 和 Elias.M.Stein 在上个世纪70年代给出的,方法太经典了,看完之后甚至会觉得自己没有必要学数学了。

(6) Ap weight

这个也是调和分析的分支之一,其中周民强老先生的书上有详细记载,就不一一阐述了。

以上的这些内容就是之前一个学期在北大学习所学到的东西,学了调和分析之后,基本上就不怕所谓的硬分析了。总之收获还是蛮多的,非常欣赏那位老师,一个学期讲了那么多东西。其实以上我提到的只是他讲的东西的一半内容,他后面还讲了很多 Schrodinger 方程的内容,由于本人实力有限,实在是没有能力再学后面的内容了。

ps:这是2009年的事情了,一晃眼7年过去了。

参考文献:
Loukas Grafakos GTM249 Classical Fourier Analysis
Loukas Grafakos GTM250 Modern Fourier Analysis
(上面这两本书是调和分析的经典之作,几乎涵盖了实变方法的所有内容。不过有点厚,差不多1100页。)

Ergodic Properties

One Dimensional Dynamics

— Welington De Melo, Sebastian van Strien

Chapter 5. Ergodic Properties and Invariant Measures.

1. Ergodicity, Attractors and Bowen-Ruelle-Sinai Measures.

A distortion result for unimodal maps with recurrence

Given a unimodal map f, we say that an interval U is symmetric if \tau(U)=U where \tau:[-1,1]\rightarrow [-1,1] is so that f(\tau(x))=f(x) and \tau(x)\neq x if x\neq c. Furthermore, for each symmetric interval U let

D_{U}=\{x: \text{ there exists } k>0 \text{ with } f^{k}(x)\in U\};

for x\in D_{U} let k(x,U) be the minimal positive integer with f^{k}(x)\in U and let

R_{U}(x)=f^{k(x,U)}(x).

We call R_{U}: D_{U}\rightarrow U the Poincare map or transfer map to U and k(x,U) the transfer time of x to U. The distortion result states that one can fined a sequence of symmetric neighbourhoods of the turning point such that the Poincare maps to these intervals have a distortion which is universally bounded:

Theorem 1.1.  Let f:[-1,1]\rightarrow [-1,1] be a unimodal map with one non-flat critical point with negative Schwarzian derivative and without attracting periodic points. Then there exists \rho>0 and a sequence os symmetric intervals U_{n}\subseteq V_{n} around the turning point which shrink to c such that V_{n} contains a \rho-scaled neighbourhood of U_{n} and such that the following properties hold.

1. The transfer time on each component of D_{U_{n}} is constant.

2. Let I_{n} be a component of the domain D_{U_{n}} of the transfer map to U_{n} which does not intersect U_{n}. Then there exists an interval T_{n}\supseteq I_{n} such that f^{k}|T_{n} is monotone, f^{k}(T_{n})\supseteq V_{n} and f^{k}(I_{n})=U_{n}. Here k is the transfer time on I_{n}, i.e., R_{U_{n}}|I_{n}=f^{k}.

Corollary. There exists K<\infty such that

1. for each component I_{n} of D_{U_{n}} not intersecting U_{n}, the transfer map R_{U_{n}} to U_{n} sends I_{n} diffeomorphically onto U_{n} and the distortion of R_{U_{n}} on I_{n} is bounded from above by K.

2. on each component I_{n} of D_{U_{n}} which is contained in U_{n}, the map R_{U_{n}}:I_{n}\rightarrow U_{n} can be written as (f^{k(n)-1}|f(I_{n}))\circ f|I_{n} where the distortion of f^{k(n)}|f(I_{n}) is universally bounded by K.

As before, we say that f is ergodic with respect to the Lebesgue measure if each completely invariant set X (Here X is called completely invariant if f^{-1}(X)=X) has either zero or full Lebesgue measure. An alternative way to define this notation of ergodicity goes as follows: f is ergodic if for each two forward invariant sets X and Y such that X\cap Y has Lebesgue measure zero, at most one of these sets has positive Lebesgue measure. (Here X is called forward invariant if f(X)\subseteq X.)

Theorem 1.2 (Blokh and Lyubich). Let f:[-1,1]\rightarrow [-1,1] be a unimodal map with a non-flat critical point with negative Schwarzian derivative and without an attracting periodic points. Then f is ergodic with respect to the Lebesgue measure.

Theorem 1.3.  Let f:[-1,1]\rightarrow [-1,1] be a unimodal map with a non-flat critical point with negative Schwarzian derivative. Then f has a unique attractor A, \omega(x)=A for almost all x and A either consists of intervals or has Lebesgue measure zero. Furthermore, one has the following:

1. if f has an attracting periodic orbit then A is this periodic orbit;

2. if f is infinitely often renormalizable then A is the attracting Cantor set \omega(c) (in which case it is called a solenoidal attractor);

3. f is only finitely often renormalizable then either

(a) A coincides with the union of the transitive intervals, or,

(b) A is a Cantor set and equal to \omega(c).

If \omega(c) is not a minimal set then f is as in case 3.a and each closed forward invariant set either contains intervals or has Lebesgue measure zero. Moreover, if \omega(c) does not contain intervals, then \omega(c) has Lebesgue measure zero.

Remark. Here a forward invariant set X is said to be minimal if the closure of the forward orbit of a point in X is always equal to X. The attractors in case 3.b is called a non-renormalizable attracting Cantor set, or absorbing Cantor attractor or wild Cantor attractor. Such an attractor really exists which is proven in [BKNS], and one has the following strange phenomenon: there exist many orbits which are dense in some finite union of intervals and yet almost all points tend to a minimal Cantor set of Lebesgue measure zero (this Cantor set is \omega(c)). The Fibonacci map is non-renormalizable and for which \omega(c) is a Cantor set. It was shown by Lyubich and Milnor that the quadratic map with this dynamics has no absorbing Cantor attractors. More generally, Jakobson and Swiatek proved that maps with negative Schwarzian derivative and which are close to the map f(x)=4x(1-x) do not have such Cantor attractors. Moreover, Lyubich has shown that these absorbing Cantor attractors can not exist if the critical point is quadratic. However, Bruin, Keller, Nowicki and Van Strien showed that the absorbing Cantor attractors exist for Fibonacci maps when the critical order \ell is sufficiently large enough.

Theorem (Lyubich). If f:[-1,1]\rightarrow [-1,1] is C^{3} unimodal, has a quadratic critical point, has negative Schwarzian derivative and has no periodic attractors, then each closed forward invariant set K which has positive Lebesgue measure contains an interval.

The next result, which is due to Martens (1990), shows that if these absorbing Cantor attractors do not exist then one has a lot of ‘expansion’. Let x not be in the pre orbit of c and define T_{n}(x) to be the maximal interval on which f^{n}|T_{n}(x) is monotone. Let R_{n}(x) and L_{n}(x) be the components of T_{n}\setminus x and define r_{n}(x) be the minimum of the length of f^{n}(R_{n}(x)) and f^{n}(L_{n}(x)).

Theorem 1.4 (Martens). Let f be a C^{3} unimodal map with negative Schwarian derivative whose critical point is non-flat. Then the following three properties are equivalent.

1. f has no absorbing Cantor attractor;

2. \limsup_{n\rightarrow \infty} r_{n}(x)>0 for almost all x;

3. there exist neighbourhoods U\subseteq V of c with cl(U)\subseteq int(V) such that for almost every x there exists a positive integer m and an interval neighbourhood T of x such that f^{m}|T is monotone, f^{m}(T)\supseteq V and f^{m}(x)\in U.

Fractals – A Very Short Introduction

Excerpt From: Falconer, Kenneth. “Fractals: A Very Short Introduction (Very Short Introductions).” iBooks.

Chapter 7
A little history

Geometry, with its highly visual and practical nature, is one of the oldest branches of mathematics. Its development through the ages has paralleled its increasingly sophisticated applications. Construction, crafts, and astronomy practised by ancient civilizations led to the need to record and analyse the shapes, sizes, and positions of objects. Notions of angles, areas, and volumes developed with the need for surveying and building. Two shapes were especially important: the straight line and the circle, which occurred naturally in many settings but also underlay the design of many artefacts. As well as fulfilling practical needs, philosophers were motivated by aesthetic aspects of geometry and sought simplicity in geometric structures and their applications. This reached its peak with the Greek School, notably with Plato (c 428–348 BC) and Euclid (c 325–265 BC), for whom constructions using a straight edge and compass, corresponding to line and circle, were the essence of geometric perfection.

As time progressed, ways were found to express and solve geometrical problems using algebra. A major advance was the introduction by René Descartes (1596–1650) of the Cartesian coordinate system which enabled shapes to be expressed concisely in terms of equations. This was a necessary precursor to the calculus, developed independently by Isaac Newton (1642–1727) and Gottfried Leibniz (1646–1714) in the late 17th century. The calculus provided a mathematical procedure for finding tangent lines that touched smooth curves as well as a method for computing areas and volumes of an enormous variety of geometrical objects. Alongside this, more sophisticated geometric figures were being observed in nature and explained mathematically. For example, using Tycho Brahe’s observations, Johannes Kepler proposed that planets moved around ellipses, and this was substantiated as a mathematical consequence of Newton’s laws of motion and gravitation.

The tools and methods were now available for tremendous advances in mathematics and the sciences. All manner of geometrical shapes could be analysed. Using the laws of motion together with the calculus, one could calculate the trajectories of projectiles, the motion of celestial bodies, and, using differential equations which developed from the calculus, more complex motions such as fluid flows. Although the calculus underlay Graph of a Brownian process8I to think of all these applications, its foundations remained intuitive rather than rigorous until the 19th century when a number of leading mathematicians including Augustin Cauchy (1789–1857), Bernhard Riemann (1826–66), and Karl Weierstrass (1815–97) formalized the notions of continuity and limits. In particular, they developed a precise definition for a curve to be ‘differentiable’, that is for there to be a tangent line touching the curve at a point. Many mathematicians worked on the assumption that all curves worthy of attention were nice and smooth so had tangents at all their points, enabling application of the calculus and its many consequences. It was a surprise when, in 1872, Karl Weierstrass constructed a ‘curve’ that was so irregular that at no point at all was it possible to draw a tangent line. The Weierstrass graph might be regarded as the first formally defined fractal, and indeed it has been shown to have fractal dimension greater than 1.

In 1883, the German Georg Cantor (1845–1918) wrote a paper introducing the middle-third Cantor set, obtained by repeatedly removing the middle thirds of intervals (see Figure 44). The Cantor set is perhaps the most basic self-similar fractal, made up of 2 scale copies of itself, although of more immediate interest to Cantor were its topological and set theoretic properties, such as it being totally disconnected, rather than its geometry. (Several other mathematicians studied sets of a similar form around the same time, including the Oxford mathematician Henry Smith (1826–83) in an article in 1874.) In 1904, Helge von Koch introduced his curve, as a simpler construction than Weierstrass’s example of a curve without any tangents. Then, in 1915, the Polish mathematician Wacław Sierpiński (1882–1969) introduced his triangle and, in 1916, the Sierpiński carpet. His main interest in the carpet was that it was a ‘universal’ set, in that it contains continuously deformed copies of all sets of ‘topological dimension’ 1. Although such objects have in recent years become the best-known fractals, at the time properties such as self-similarity were almost irrelevant, their main use being to provide specific examples or counter-examples in topology and calculus.

It was in 1918 that Felix Hausdorff proposed a natural way of ‘measuring’ the middle-third Cantor set and related sets, utilizing a general approach due to Constantin Carathéodory (1873–1950). Hausdorff showed that the middle-third Cantor set had dimension of log2/log3 = 0.631, and also found the dimensions of other self-similar sets. This was the first occurrence of an explicit notion of fractional dimension. Now termed ‘Hausdorff dimension’, his definition of dimension is the one most commonly used by mathematicians today. (Hausdorff, who did foundational work in several other areas of mathematics and philosophy, was a German Jew who tragically committed suicide in 1942 to avoid being sent to a concentration camp.) Box-dimension, which in many ways is rather simpler than Hausdorff dimension, appeared in a 1928 paper by Georges Bouligand (1889–1979), though the idea underlying an equivalent definition had been mentioned rather earlier by Hermann Minkowski (1864–1909), a Polish mathematician known especially for his work on relativity.

For many years, few mathematicians were very interested in fractional dimensions, with highly irregular sets continuing to be regarded as pathological curiosities. One notable exception was Abram Besicovitch (1891–1970), a Russian mathematician who held a professorship in Cambridge for many years. He, along with a few pupils, investigated the dimension of a range of fractals as well as investigating some of their geometric properties.

Excerpt From: Falconer, Kenneth. “Fractals: A Very Short Introduction (Very Short Introductions).” iBooks.

 

Perron-Frobenius Operator

Perron-Frobenius Operator

Consider a map f which possibly has a finite (or countable) number of discontinuities or points where possibly the derivative does not exist. We assume that there are points

\displaystyle q_{0}<q_{1}<\cdot\cdot\cdot <q_{k} or q_{0}<q_{1}<\cdot\cdot\cdot<q_{\infty}<\infty

such that f restricted to each open interval A_{j}=(q_{j-1},q_{j}) is C^{2}, with a bound on the first and the second derivatives. Assume that the interval [q_{0},q_{k}] ( or [q_{0},q_{\infty}] ) is positive invariant, so f(x)\in [q_{0},q_{k}] for all x\in [q_{0}, q_{k}] ( or f(x)\in [q_{0},q_{\infty}]  for all x\in[q_{0},q_{\infty}] ).

For such a map, we want a construction of a sequence of density functions that converge to a density function of an invariant measure. Starting with \rho_{0}(x)\equiv(q_{k}-q_{0})^{-1} ( or \rho_{0}(x)\equiv(q_{\infty}-q_{0})^{-1} ),assume that we have defined densities up to \rho_{n}(x), then define define \rho_{n+1}(x) as follows

\displaystyle \rho_{n+1}(x)=P(\rho_{n})(x)=\sum_{y\in f^{-1}(x)}\frac{\rho_{n}(y)}{|Df(y)|}.

This operator P, which takes one density function to another function, is called the Perron-Frobenius operator. The limit of the first n density functions converges to a density function \rho^{*}(x),

\displaystyle \rho^{*}(x)=\lim_{k\rightarrow \infty}\frac{1}{k}\sum_{n=0}^{k-1}\rho_{n}(x).

The construction guarantees that \rho^{*}(x) is the density function for an invariant measure \mu_{\rho^{*}}.

Example 1. Let

\displaystyle f(x)= \begin{cases}  x &\mbox{if } x\in(0,\frac{1}{2}), \\  2x &\mbox{if } x\in(\frac{1}{2},1).  \end{cases}

Screen Shot 2014-11-08 at 9.55.51 am

We construct the first few density functions by applying the Perron-Frobenius operator, which indicates the form of the invariant density function.
Take \rho_{0}(x)\equiv1 on [0,1]. From the definition of f(x), the slope on (0,\frac{1}{2}) and (\frac{1}{2},1) are 1 and 2, respectively. If x\in (\frac{1}{2},1), then it has only one pre-image on (\frac{1}{2},1); else if x\in(0,\frac{1}{2}), then it has two pre-images, one is x^{'} in (0,\frac{1}{2}), the other one is x^{''} in (\frac{1}{2},1). Therefore,

\rho_{1}(x)= \begin{cases}  \frac{1}{1}+\frac{1}{2} &\mbox{if } x\in(0,\frac{1}{2}), \\  \frac{1}{2} &\mbox{if } x\in(\frac{1}{2},1).  \end{cases}

By similar considerations,

\displaystyle \rho_{2}(x)=\begin{cases}1+\frac{1}{2}+\frac{1}{2^{2}} &\mbox{if } x\in(0,\frac{1}{2}), \\ \frac{1}{2^{2}} &\mbox{if } x\in(\frac{1}{2},1).\end{cases}

By induction, we get

\displaystyle \rho_{n}(x)=\begin{cases}1+\frac{1}{2}+\cdot\cdot\cdot+\frac{1}{2^{n}} &\mbox{if } x\in(0,\frac{1}{2}), \\ \frac{1}{2^{n}} &\mbox{if } x\in(\frac{1}{2},1).\end{cases}

Now, we begin to calculate the density function \rho^{*}(x). If x\in(0,\frac{1}{2}), then
\displaystyle  \rho^{*}(x)=\lim_{k\rightarrow \infty}\frac{1}{k}\sum_{n=0}^{k-1}\rho_{n}(x)  =\lim_{k\rightarrow \infty}\frac{1}{k}\sum_{n=0}^{k-1} \sum_{m=0}^{n}\frac{1}{2^{m}}  =\lim_{k\rightarrow \infty}\frac{1}{k}\sum_{n=0}^{k-1}\left(2-\frac{1}{2^{n}}\right)=2.
If x\in(\frac{1}{2},1), then
\displaystyle  \rho^{*}(x)=\lim_{k\rightarrow \infty}\frac{1}{k}\sum_{n=0}^{k-1}\rho_{n}(x)  =\lim_{k\rightarrow \infty}\frac{1}{k}\sum_{n=0}^{k-1}\frac{1}{2^{n}}  =\lim_{k\rightarrow \infty}\frac{1}{k}\left(2-\frac{1}{2^{k}}\right)=0.
i.e.

\displaystyle \rho^{*}(x)= \begin{cases}  2 &\mbox{if } x\in(0,\frac{1}{2}), \\  0 &\mbox{if } x\in(\frac{1}{2},1).  \end{cases}

Example 2. Let

\displaystyle f(x)=\begin{cases}  2x &\mbox{if } x\in(0,\frac{1}{2}), \\  2x-1 &\mbox{if } x\in(\frac{1}{2},1).  \end{cases}

Screen Shot 2014-11-08 at 9.56.12 am

Take \rho_{0}(x)\equiv1 on (0,1). By induction, \rho_{n}(x)\equiv1 on (0,1) for all n\geq 0. Therefore, \rho^{*}(x)\equiv1 on (0,1).

Example 3. Let

\displaystyle f(x)=\begin{cases}  x &\mbox{if } x\in(0,\frac{1}{2}), \\  2^{n+1}\cdot\left(x-\left(1-\frac{1}{2^{n}}\right)\right) &\mbox{if } x\in\left(1-\frac{1}{2^{n}},1-\frac{1}{2^{n+1}}\right) \text{ for all } n\geq 1.\end{cases}

Screen Shot 2014-11-08 at 9.56.31 am

Take \rho_{0}(x)\equiv1 on (0,1). Assume

\displaystyle \rho_{n}(x)= \begin{cases}  a_{n} &\mbox{if } x\in(0,\frac{1}{2}), \\  b_{n} &\mbox{if } x\in(\frac{1}{2},1).  \end{cases}

for all n\geq 0. It is obviously that a_{0}=b_{0}=1. By similar considerations,
\displaystyle \rho_{n+1}(x)= \begin{cases}  \frac{a_{n}}{1}+\frac{b_{n}}{4}+\frac{b_{n}}{8}+\frac{b_{n}}{16}+\cdot\cdot\cdot= a_{n}+\frac{b_{n}}{2} &\mbox{if } x\in(0,\frac{1}{2}), \\  \frac{b_{n}}{4}+\frac{b_{n}}{8}+\frac{b_{n}}{16}+\cdot\cdot\cdot = \frac{b_{n}}{2} &\mbox{if } x\in(\frac{1}{2},1).  \end{cases}
That means

\displaystyle \left( \begin{array}{ccc}  a_{n+1} \\  b_{n+1}  \end{array} \right)  =\left( \begin{array}{ccc}  a_{n}+\frac{1}{2}b_{n} \\  \frac{1}{2}b_{n}  \end{array} \right)  = \left( \begin{array}{ccc}  1 & \frac{1}{2} \\  0 & 1  \end{array} \right)  \left( \begin{array}{ccc}  a_{n} \\  b_{n}  \end{array} \right)

for all n\geq 0. From direct calculation, \displaystyle a_{n}=2-\frac{1}{2^{n}} and \displaystyle b_{n}=\frac{1}{2^{n}} for all n\geq 0. Therefore,

\displaystyle \rho^{*}(x)=\lim_{k\rightarrow \infty}\frac{1}{k}\sum_{n=0}^{k-1}\rho_{n}(x)=\begin{cases}  2 &\mbox{if } x\in (0,\frac{1}{2}), \\  0 &\mbox{if } x\in (\frac{1}{2},1).  \end{cases}

Example 4. Let

\displaystyle f(x)=\begin{cases}  1.5 x &\mbox{if } x\in(0,\frac{1}{2}), \\  2^{n+1}\cdot\left(x-\left(1-\frac{1}{2^{n}}\right)\right) &\mbox{if } x\in\left(1-\frac{1}{2^{n}},1-\frac{1}{2^{n+1}}\right) \text{ for all } n\geq 1.\end{cases}

Screen Shot 2014-11-08 at 9.56.38 am

Take \rho_{0}(x)\equiv1 on (0,1). Assume

\displaystyle \rho_{n}(x)= \begin{cases}  a_{n} &\mbox{if } x\in(0,\frac{3}{4}), \\  b_{n} &\mbox{if } x\in(\frac{3}{4},1).  \end{cases}

for all n\geq 0. It is obviously that a_{0}=b_{0}=1. By similar considerations,

\displaystyle \left( \begin{array}{ccc}  a_{n+1} \\  b_{n+1}  \end{array} \right)  =\left( \begin{array}{ccc}  \frac{11}{12}a_{n}+\frac{1}{4}b_{n} \\  \frac{1}{4}a_{n}+\frac{1}{4}b_{n}  \end{array} \right)  = \left( \begin{array}{ccc}  \frac{11}{12} & \frac{1}{4} \\  \frac{1}{4} & \frac{1}{4}  \end{array} \right)  \left( \begin{array}{ccc}  a_{n} \\  b_{n}  \end{array} \right)

for all n\geq 0. From matrix diagonalization , \displaystyle a_{n}=\frac{6}{5}-\frac{1}{5}\cdot\frac{1}{6^{n}} and \displaystyle b_{n}=\frac{2}{5}+\frac{3}{5}\cdot\frac{1}{6^{n}} for all n\geq 0.

Therefore,

\displaystyle \rho^{*}(x)=\lim_{k\rightarrow \infty}\frac{1}{k}\sum_{n=0}^{k-1}\rho_{n}(x)=\begin{cases}  \frac{6}{5} &\mbox{if } x\in (0,\frac{3}{4}), \\  \frac{2}{5} &\mbox{if } x\in (\frac{3}{4},1).  \end{cases}

Perron-Frobenius Theory

Definition. Let A=[a_{ij}] be a k\times k matrix. We say A is non-negative if a_{ij}\geq 0 for all i,j. Such a matrix is called irreducible if for any pair i,j there exists some n>0 such that a_{ij}^{(n)}>0 where a_{ij}^{(n)} is the (i,j)-th element of A^{n}. The matrix A is irreducible and aperiodic if there exists n>0 such that a_{ij}^{(n)}>0 for all i,j.

Perron-Frobenius Theorem Let A=[a_{ij}] be a non-negative k\times k matrix.

(i) There is a non-negative eigenvalue \lambda such that no eigenvalue of A has absolute value greater than \lambda.

(ii) We have \min_{i}(\sum_{j=1}^{k}a_{ij})\leq \lambda\leq \max_{i}(\sum_{j=1}^{k}a_{ij}).

(iii) Corresponding to the eigenvalue \lambda there is a non-negative left (row) eigenvector u=(u_{1},\cdot\cdot\cdot, u_{k}) and a non-negative right (column) eigenvector v=(v_{1},\cdot\cdot\cdot, v_{k})^{T}.

(iv) If A is irreducible then \lambda is a simple eigenvalue and the corresponding eigenvectors are strictly positive (i.e. u_{i}>0, v_{i}>0 all i).

(v) If A is irreducible then \lambda is the only eigenvalue of A with a non-negative eigenvector.

Theorem.
Let A be an irreducible and aperiodic non-negative matrix. Let u=(u_{1},\cdot\cdot\cdot, u_{k}) and v=(v_{1},\cdot\cdot\cdot, v_{k})^{T} be the strictly positive eigenvectors corresponding to the largest eigenvalue \lambda as in the previous theorem. Then for each pair i,j, \lim_{n\rightarrow \infty} \lambda^{-n}a_{ij}^{(n)}=u_{j}v_{i}.

Now, let us see previous examples, again. The matrix A is irreducible and aperiodic non-negative matrix, and \lambda=1 has the largest absolute value in the set of all eigenvalues of A. From Perron-Frobenius Theorem, u_{i}, v_{j}>0 for all pairs i,j. Then for each pari i,j,
\lim_{n\rightarrow \infty}a_{ij}^{(n)}=u_{j}v_{i}. That means \lim_{n\rightarrow \infty}A^{(n)} is a strictly positive k\times k matrix.

Markov Maps

Definition of Markov Maps. Let N be a compact interval. A C^{1} map f:N\rightarrow N is called Markov if there exists a finite or countable family I_{i} of disjoint open intervals in N such that

(a) N\setminus \cup_{i}I_{i} has Lebesgue measure zero and there exist C>0 and \gamma>0 such that for each n\in \mathbb{N} and each interval I such that f^{j}(I) is contained in one of the intervals I_{i} for each j=0,1,...,n one has

\displaystyle \left| \frac{Df^{n}(x)}{Df^{n}(y)}-1 \right| \leq C\cdot |f^{n}(x)-f^{n}(y)|^{\gamma} \text{ for all } x,y\in I;

(b) if f(I_{k})\cap I_{j}\neq \emptyset, then f(I_{k})\supseteq I_{j};

(c) there exists r>0 such that |f(I_{i})|\geq r for each i.

As usual, let \lambda be the Lebesgue measure on N. We may assume that \lambda is a probability measure, i.e., \lambda(N)=1. Usually, we will denote the Lebesgue measure of a Borel set A by |A|.

Theorem.  Let f:N\rightarrow N be a Markov map and let \cup_{i}I_{i} be corresponding partition. Then there exists a f-invariant probability measure \mu on the Borel sets of N which is absolutely continuous with respect to the Lebesgue measure \lambda. This measure satisfies the following properties:

(a) its density \frac{d\mu}{d\lambda} is uniformly bounded and Holder continuous. Moreover, for each i the density is either zero on I_{i} or uniformly bounded away from zero.

If for every i and j one has f^{n}(I_{j})\supseteq I_{i} for some n\geq 1 then

(b) the measure is unique and its density \frac{d\mu}{d\lambda} is strictly positive;

(c) f is exact with respect to \mu;

(d) \lim_{n\rightarrow \infty} |f^{-n}(A)|=\mu(A) for every Borel set A\subseteq N.

If f(I_{i})=N for each interval I_{i}, then

(e) the density of \mu is also uniformly bounded from below.

Notes on Shape of Inner Space

Shape of Inner Space

shing-tung_yau_nadis_s._the_shape_of_inner_space

String Theory and the Geometry of the Universe’s Hidden Dimensions

Shing-Tung YAU and Steve NADIS

Chapter 3: P.39

My personal involvement in this area began in 1969, during my first semester of graduate studies at Berkeley. I needed a book to read during Chrismas break. Rather than selecting Portnoy’s Complaint, The Godfather, The Love Machine, or The Andromeda Strain-four top-selling books of that year-I opted for a less popular title, Morse Theory, by the American mathematician John Milnor. I was especially intrigued by Milnor’s section on topology and curvature, which explored the notion that local curvature has a great influence on geometry and topology. This is a theme I’ve pursued ever since, because the local curvature of a surface is determined by taking the derivatives of that surface, which is another way of saying it is based on analysis. Studying how that curvature influences geometry, therefore, goes to the heart of geometric analysis.

Having no office, I practically lived in Berkeley’s math library in those days. Rumor has it that the first thing I did upon arriving in the United States was visiting that library, rather than, say, explore San Francisco as other might have done. While I can’t remember exactly what I did, forty years hence, I have no reason to doubt the veracity of that rumor. I wandered around the library, as was my habit, reading every journal I could get my hands on. In the course of rummaging through the reference section during winter break, I came across a 1968 article by Milnor, whose book I was still reading. That article, in turn, little else to do at the time (with most people away for the holiday), I tried to see if I could prove something related to Preissman’s theorem.

Chapter 4: P.80

From this sprang the work I’ve become most famous for. One might say it was my calling. No matter what our station, we’d all like to find our true calling in life-that special thing we were put on this earth to do. For an actor, it might be playing Stanley Kowalski in A Streetcar Named Desire. Or the lead role in Hamlet. For a firefighter, it could mean putting out a ten-alarm blaze. For a crime-fighter, it could mean capturing Public Enemy Number One. And in mathematics, it might come down to finding that one problem you’re destined to work on. Or maybe destiny has nothing to do with it. Maybe it’s just a question of finding a problem you can get lucky with.

To be perfectly honest, I never think about “destiny” when choosing a problem to work on, as I tend to be a bit more pragmatic. I try to seek out a new direction that could bring to light new mathematical problems, some of which might prove interesting in themselves. Or I might pick an existing problem that offers the hope that in the course of trying to understand it better, we will be led to a new horizon.

The Calabi conjecture, having been around a couple of decades, fell into the latter category. I latched on to this problem during my first year of graduate school, though sometimes it seemed as if the problem latched on to me. It caught my interest in a way that no other problem had before or has since, as I sensed that it could open a door to a new branch of mathematics. While the conjecture was vaguely related to Poincare’s classic problem, it struck me as more general because if Calabi’s hunch were true, it would lead to a large class of mathematical surfaces and spaces that we didn’t know anything about-and perhaps a new understanding of space-time. For me the conjecture was almost inescapable: Just about every road I pursued in my early investigations of curvature led to it.

Chapter 5: P.104

A mathematical proof is a bit like climbing a mountain. The first stage, of course, is discovering a mountain worth climbing. Imagine a remote wilderness area yet to be explored. It takes some wit just to find such an area, let alone to know whether something worthwhile might be found there. The mountaineer then devises a strategy for getting to the top-a plan that appears flawless, at least on paper. After acquiring the necessary tools and equipment, as well as mastering the necessary skills, the adventurer mounts an ascent, only to be stopped by unexpected difficulties. But others follow in their predecessor’s footsteps, using the successful strategies, while also pursuing different avenues-thereby reaching new heights in the process. Finally someone comes along who not only has a good plan of attack that avoids the pitfalls of the past but also has the fortitude and determination to reach the summit, perhaps planting a flag there to mark his or her presence. The risks to life and limb are not so great in math, and the adventure may not be so apparent to the outsider. And at the end of a long proof, the scholar does not plant a flag. He or she types in a period. Or a footnote. Or a technical appendix. Nevertheless, in our field there are thrill as well as perils to be had in the pursuit, and success still rewards those of us who’ve gained new views into nature’s hidden recesses.

Normal Families

Reference Book: Joel L.Schiff- Normal Families

Some Classical Theorems

Weierstrass Theorem Let \{ f_{n}\} be a sequence of analytic functions on a domain \Omega which converges uniformly on compact subsets of \Omega to a function f. Then f is analytic in \Omega, and the sequence of derivatives \{ f_{n}^{(k)}\} converges uniformly on compact subsets to f^{(k)}, k=1,2,3....

Hurwitz Theorem Let \{ f_{n}\} be a sequence of analytic functions on a domain \Omega which converges uniformly on compact subsets of \Omega to a non-constant analytic function f(z). If f(z_{0})=0 for some z_{0}\in\Omega, then for each r>0 sufficiently small, there exists an N=N(r), such that for all n>N, f_{n}(z) has the same number of zeros in D(z_{0},r) as does f(z). (The zeros are counted according to multiplicity).

The Maximum Principle If f(z) is analytic and non-constant in a region \Omega, then its absolute value |f(z)| has no maximum in \Omega.

The Maximum Principle’ If f(z) is defined and continuous on a closed bounded set E and analytic on the interior of E, then the maximum of |f(z)| on E is assumed on the boundary of E.

Corollary 1.4.1 If \{ f_{n}\} is a sequence of univalent analytic functions in a domain \Omega which converge uniformly on compact subsets of \Omega to a non-constant analytic function f, then f is univalent in \Omega.

Definition 1.5.1 A family of functions \mathcal{F} is locally bounded on a domain \Omega if, for each z_{0}\in \Omega, there is a positive number M=M(z_{0}) and a neighbourhood D(z_{0},r)\subset \Omega such that |f(z)|\leq M for all z\in D(z_{0}, r) and all f\in \mathcal{F}.

Theorem 1.5.2 If \mathcal{F} is a family of locally bounded analytic functions on a domain \Omega, then the family of derivatives \mathcal{F}^{'}=\{ f^{'}: f\in \mathcal{F}\} form a locally bounded family in \Omega.

The converse of Theorem 1.5.2 is false, since \mathcal{F}=\{n: n=1,2,3...\}. However, the following partial converse does hold.

Theorem 1.5.3 Let \mathcal{F} be a family of analytic functions on \Omega such that the family of derivatives \mathcal{F}^{'} is locally bounded and suppose that there is some z_{0}\in \Omega with |f(z_{0})|\leq M<\infty for all f\in \mathcal{F}. Then \mathcal{F} is locally bounded. (Hint: find a path connecting z_{0} and z.)

Definition 1.6.1 A family \mathcal{F} of functions defined on a domain \Omega is said to be equicontinuous (spherically continuous) at a point z^{'}\in \Omega if, for each \epsilon>0, there is a \delta=\delta(\epsilon,z^{'})>0 such that |f(z)-f(z^{'})|<\epsilon(\chi(f(z),f(z^{'}))<\epsilon) whenever |z-z^{'}|<\delta, for every f\in \mathcal{F}. Moreover, \mathcal{F} is equicontinuous (spherical continuous) on a subset E\subset \Omega if it is continuous (spherically continuous) at each point of E.

Normal Families of Analytic Functions

Definition 2.1.1  A familiy \mathcal{F} of  analytic functions on a domain \Omega\subset \mathbb{C} is normal in \Omega if every sequence of functions \{f_{n}\}\subset \mathcal{F} contains either a subsequence which converges to a limit function f\not\equiv \infty uniformly on each compact subset of \Omega, or a subsequence which converges uniformly to \infty on each compact subset.

The family \mathcal{F} is said to be normal at a point z_{0}\in\Omega if it is normal in some neighbourhood of z_{0}.

Theorem 2.1.2 A family of analytic functions \mathcal{F} is normal in a domain \Omega if and only if \mathcal{F} is normal at each point in \Omega.

Theorem 2.2.1 Arzela-Ascoli Theorem. If a sequence \{f_{n}\} of continuous functions converges uniformly on a compact set K to a limit function f\not\equiv \infty, then \{f_{n}\} is equicontinuous on K, and f is continuous. Conversely, if \{f_{n}\} is equicontinuous and locally bounded on \Omega, then a subsequence can be extracted from \{f_{n}\} which converges locally uniformly in \Omega to a (continuous) limit function f.

Montel’s Theorem If \mathcal{F} is a locally bounded family of analytic functions on a domain \Omega, then \mathcal{F} is a normal family in \Omega.

Koebe Distortion Theorem Let f(z) be analytic univalent in a domain \Omega and K a compact subset of \Omega. Then there exists a constant c=c(\Omega, K) such that for any z,w\in K, c^{-1}\leq |f^{'}(z)| / |f^{'}(w)| \leq c.

Vitali-Porter Theorem Let \{f_{n}\} be a locally bounded sequence of analytic functions in a domain \Omega such that \lim_{n\rightarrow \infty}f_{n}(z) exists for each z belonging to a set E\subset \Omega which has an accumulation point in \Omega. Then \{ f_{n}\} converges uniformly on compact subsets of \Omega to an analytic function.

Proof. From Montel’s Theorem, \{ f_{n}\} is normal, extract a subsequence \{ f_{n_{k}}\} which converges normally to an analytic function f. Then \lim_{k\rightarrow \infty} f_{n_{k}}(z)=f(z) for each z\in E.  Suppose, however, that \{ f_{n}\} does not converge uniformly on compact subsets of \Omega to f. Then there exists some \epsilon>0, a compact subset K\subset \Omega, as well as a subsequence \{f_{m_{j}}\} and points z_{j}\in K satisfying |f_{m_{j}}(z_{j})- f(z_{j})| \geq \epsilon, j=1,2,3,.... Now \{ f_{m_{j}}\} itself has a subsequence which converges uniformly on compact subsets to an analytic function g, and g\not\equiv f from above. However, since f and g must agree at all points of E, the Identity Theorem for analytic functions implies f\equiv g on \Omega, a contradiction which establishes the theorem.

Fundamental Normality Test Let \mathcal{F} be the family of analytic functions on a domain \Omega which omit two fixed values a and b in \mathbb{C}. Then \mathcal{F} is normal in \Omega.

Generalized Normality Test Suppose that \mathcal{F} is a family of analytic functions in a domain \Omega which omit a value a\in \mathbb{C} and such that no function of \mathcal{F} assumes the value b\in \mathbb{C} at more that p points. Then \mathcal{F} is normal in \Omega.

2.3 Examples:

Assume U is the unit disk in the complex plane, \Omega is a region (connected open set) in \mathbb{C}.

1. \mathcal{F}=\{ f_{n}(z)=z^{n}: n=1,2,3...\} in U. Then \mathcal{F} is normal in U, but not compact since 0 \notin \mathcal{F}. In the domain U^{'}: |z|>1, \mathcal{F} is normal.

2. \mathcal{F}=\{ f_{n}(z)=\frac{z}{n}: n=1,2,3...\} is a normal family in \mathcal{C} but not compact.

3. \mathcal{F}=\{ f: f analytic in \Omega  and |f|\leq M \}. Then \mathcal{F} is normal in \Omega and compact.

4. \mathcal{F}=\{ f: f analytic in \Omega and \Re f>0\}. Then \mathcal{F} is normal but not compact. Hint: \mathcal{G}=\{g=e^{-f}:f\in \mathcal{F}\} is a uniformly bounded family.

5. \mathcal{S}=\{ f: f analytic, univalent in U, f(0)=0, f^{'}(0)=1 \}. These are the normalised “Schlicht” functions in U. \mathcal{S} is normal and compact.

Normal Families of Meromorphic Functions

Assume a function f(z) is analytic in a neighbourhood of a, except perhaps at a itself. In other words, f(z) shall be analytic in a region 0<|z-a|<\delta. The point a is called an isolated singularity of f(z). There are three cases about an isolated singularity. The first one is a removable singularity, the second one is a pole, the third one is an essential singularity.  A function f(z) which is analytic in a region \Omega, except for poles, is said to be meromorphic in \Omega.

The chordal distance \chi(z_{1}, z_{2}) between z_{1} and z_{2} is

\chi(z_{1}, z_{2}) = \frac{|z_{1}-z_{2}|}{\sqrt{1+|z_{1}|^{2}}\sqrt{1+|z_{2}|^{2}}} if z_{1} and z_{2} are in the finite plane, and

\chi(z_{1}, \infty) = \frac{1}{\sqrt{1+|z_{1}|^{2}}}, if z_{2}=\infty. Clearly, \chi(z_{1}, z_{2})\leq 1, and \chi(z_{1}^{-1}, z_{2}^{-1}) = \chi(z_{1}, z_{2}). The chordal metric and spherical metric are uniformly equivalent and generate the same open sets on the Riemann sphere.

Definition 1.2.1 A sequence of functions \{ f_{n}\} converges spherically uniformly to f on a set E\subset \mathbb{C} if, for any \epsilon>0, there is a number n_{0} such that n\geq n_{0} implies \chi(f(z), f_{n}(z))<\epsilon, for all z\in E.

Definition 3.1.1 A family \mathcal{F} of meromorphic functions in a domain \Omega is normal in \Omega if every sequence \{ f_{n} \} \subset \mathcal{F} contains a subsequence which converges spherically uniformly on compact subsets of \Omega.

Theorem 3.1.3 Let \{ f_{n}\} be a sequence of meromorphic functions on a domain \Omega. Then \{ f_{n}\} converges spherically uniformly on compact subsets of \Omega to f if and only if about each point z_{0}\in \Omega there is a closed disk K(z_{0},r) in which |f_{n}-f|\rightarrow 0 or |1/f_{n} - 1/f| \rightarrow 0 uniformly as n\rightarrow \infty.

Corollary 3.1.4 Let \{ f_{n}\} be a sequence of meromorphic functions on \Omega which converges spherically uniformly on compact subsets to f. Then f is either a meromorphic function on \Omega or identically equal to \infty.

Corollary 3.1.5  Let \{ f_{n}\} be a sequence of analytic functions on a domain \Omega which converge spherically uniformly on compact subsets of \Omega to f. Then f is either analytic on \Omega or identically equal to \infty.

Theorem 3.2.1 A family \mathcal{F} of meromorphic functions in a domain \Omega is normal if and only if \mathcal{F} is spherically equicontinuous in \Omega.

Fundamental Normality Test Let \mathcal{F} be a family of meromorphic functions on a domain \Omega which omit three distinct values a, b, c \in \mathbb{C}. Then \mathcal{F} is normal in \Omega.

Vitali-Porter Theorem Let \{f_{n}\} be a sequence belonging to a spherically equicontinuous family of meromorphic functions such that \{ f_{n}(z)\} converges spherically on a point set E having an accumulation point in \Omega. Then \{ f_{n}\} converges spherically uniformly on compact subsets of \Omega.

Let f(z) be meromorphic on a domain \Omega. If z\in \Omega is not a pole, the derivative in the spherical metric, called the spherical derivative, is given by f^{\#}(z) =\lim_{z^{'}\rightarrow z}\frac{\chi(f(z),f(z^{'}))}{|z-z^{'}|} =\frac{|f^{'}(z) |}{1+|f(z)|^{2}}. If \zeta is a pole of f(z), define f^{\#}(\zeta) = \lim_{z\rightarrow \zeta} \frac{|f^{'}(z)|}{1+|f(z)|^{2}} .

Marty’s Theorem A family \mathcal{F} of meromorphic functions on a domain \Omega is normal if and only if for each compact subset K\subset \Omega, there exists a constant C=C(K) such that spherical derivative f^{\#}(z) =\frac{|f^{'}(z) |}{1+|f(z)|^{2}}\leq C, z\in K, f\in \mathcal{F}, that is, f^{\#} is locally bounded.

[转载]痛批计算数学所

发信人: rodm48gmf (—>—>—>—>—>—>—>—>—>—),

信区: D_Maths

标 题: 老板痛批计算数学所 (转)

发信站: 南京大学小百合站 (Sat Oct 11 15:59:23 2014)

昨天讨论班上,一位师兄就博士论文向老板咨询。老板语重心长的说:现在好歹也是博士了,论文里必须要有些自己的东西,能拿别人的东西拼凑!接着话锋一转,说道:最近中国的大飞机搞得很热闹,发动机是别人的不算,连自己测试出来的飞机模型数据因为没 有算法也不会算,只好花了6000万美金问老美买软件。但美方条件相当苛刻,要求中方把初始数据的备份拷给美方,等美方分析完了,再把最终结果告诉你,中间过程没法看到。 这下可好,不但算法核心没法掌握,连同我们的飞机性能也让人家了如指掌了,不用侦查卫星,就把你查个底朝天。

老板接着说:回头看我们的计算数学所,近几年来对数学理论本身的要求越来越弱化。招的学生在本科就学些计算数学专业的数值分析,数学软件。到了研究生阶段,只看见天天 泡在电脑前面敲键盘、调试程序,写出的算法无非是对已有的东西小打小闹,根本没有理论深度。也不是他不花功夫,是实在是层次太低,别说微分几何、代数拓扑这些常规的东西都不懂,即使是本科的数学物理方程,真正学好的人也没几个。对偏微分方程的认识皮 毛也谈不上,你说他怎么写的出好的算法。

老板还说,若把那些大飞机、卫星项目让企业来做,就更加不行了。那些工程师自从高校中出去以后,就开始吃老本,天下算法一大抄,有的甚至为了拟合精度而篡改实验数据,这样造出来的飞机、卫星能不掉下来吗?当然为了混口饭吃,这么做也并不难理解。

最后,老板说道:反观数学所,我们也没必要高兴到哪里去。可能我们这儿招的做PDE的学 生,数学物理方程还没计算所的好呢。现在国家急了,对重大计算项目特别重视,基金委拨了2亿元立项,十二五期间务必在这方面有重大突破。所以啊,不管怎么说,大家做东西 一定得有自己的想法,一步步把结果做上去!现在无论是做大飞机、还是搞卫星,都需要研究人员不仅会编写程序,还要懂得其中深刻的数学原理,可能它的理论难点就涉及某些奇点理论,这就需要大家懂微分几何、懂流形拓扑、懂奇点理论。物理背景的重要性是毋庸置疑的,量子理论作为物理的两大基石之一,处处发挥重要作用。如果我们的研究人员都具备了这样的素质,我们的科研才有希望。而中国的计算数学差就差在两点:一、没有 数学理论做依托,只会微积分和矩阵论。二、系统集成能力太差,编出Windows之类的操作系统根本不可能!

※ 来源:.南京大学小百合站 http://bbs.nju.edu.cn

[FROM: 114.212.206.39]

2014 International Congress of Mathematics: Awards

Fields Medalist:

Artur Avila

p_1

CNRS, France & IMPA, Brazil

[Artur Avila is awarded a Fields Medal] for his profound contributions to dynamical systems theory have changed the face of the field, using the powerful idea of renormalization as a unifying principle.

Avila leads and shapes the field of dynamical systems. With his collaborators, he has made essential progress in many areas, including real and complex one-dimensional dynamics, spectral theory of the one-frequency Schródinger operator, flat billiards and partially hyperbolic dynamics.

Avila’s work on real one-dimensional dynamics brought completion to the subject, with full understanding of the probabilistic point of view, accompanied by a complete renormalization theory. His work in complex dynamics led to a thorough understanding of the fractal geometry of Feigenbaum Julia sets.

In the spectral theory of one-frequency difference Schródinger operators, Avila came up with a global de- scription of the phase transitions between discrete and absolutely continuous spectra, establishing surprising stratified analyticity of the Lyapunov exponent.

In the theory of flat billiards, Avila proved several long-standing conjectures on the ergodic behavior of interval-exchange maps. He made deep advances in our understanding of the stable ergodicity of typical partially hyperbolic systems.

Avila’s collaborative approach is an inspiration for a new generation of mathematicians.

 

Manjul Bhargava

p_2

Princeton University, USA

[Manjul Bhargava is awarded a Fields Medal] for developing powerful new methods in the geometry of numbers and applied them to count rings of small rank and to bound the average rank of elliptic curves.

Bhargava’s thesis provided a reformulation of Gauss’s law for the composition of two binary quadratic forms. He showed that the orbits of the group SL(2, Z)3 on the tensor product of three copies of the standard integral representation correspond to quadratic rings (rings of rank 2 over Z) together with three ideal classes whose product is trivial. This recovers Gauss’s composition law in an original and computationally effective manner. He then studied orbits in more complicated integral representations, which correspond to cubic, quartic, and quintic rings, and counted the number of such rings with bounded discriminant.

Bhargava next turned to the study of representations with a polynomial ring of invariants. The simplest such representation is given by the action of PGL(2, Z) on the space of binary quartic forms. This has two independent invariants, which are related to the moduli of elliptic curves. Together with his student Arul Shankar, Bhargava used delicate estimates on the number of integral orbits of bounded height to bound the average rank of elliptic curves. Generalizing these methods to curves of higher genus, he recently showed that most hyperelliptic curves of genus at least two have no rational points.

Bhargava’s work is based both on a deep understanding of the representations of arithmetic groups and a unique blend of algebraic and analytic expertise.

 

Martin Hairer

p_3

University of Warwick, UK

[Martin Hairer is awarded a Fields Medal] for his outstanding contributions to the theory of stochastic partial differential equations, and in particular created a theory of regularity structures for such equations.

A mathematical  problem that  is important  throughout science is to understand the influence of noise on differential equations, and on the long time behavior of the solutions. This problem was solved for ordinary differential equations by Itó in the 1940s. For partial differential equations, a comprehensive theory has proved to be more elusive, and only particular cases (linear equations, tame nonlinearities, etc.)  had been treated satisfactorily.

Hairer’s work addresses two central aspects of the theory.  Together with Mattingly  he employed the Malliavin calculus along with new methods to establish the ergodicity of the two-dimensional stochastic Navier-Stokes equation.

Building  on the rough-path approach of Lyons for stochastic ordinary differential equations, Hairer then created an abstract theory of regularity structures for stochastic partial differential equations (SPDEs). This allows Taylor-like expansions around any point in space and time. The new theory allowed him to construct systematically solutions to singular non-linear SPDEs  as fixed points of a renormalization procedure.

Hairer was thus able to give, for the first time, a rigorous intrinsic meaning to many SPDEs arising in physics.

 

Maryam Mirzakhani

p_4

Stanford University, USA

[Maryam Mirzakhani is awarded the Fields Medal] for her outstanding contributions to the dynamics and geometry of Riemann surfaces and their moduli spaces.

Maryam Mirzakhani has made stunning advances in the theory of Riemann surfaces and their moduli spaces, and led the way to new frontiers in this area. Her insights have integrated methods from diverse fields, such as algebraic geometry, topology and probability theory.

In hyperbolic geometry, Mirzakhani established asymptotic formulas and statistics for the number of simple closed geodesics on a Riemann surface of genus g. She next used these results to give a new and completely unexpected proof of Witten’s conjecture, a formula for characteristic classes for the moduli spaces of Riemann surfaces with marked points.

In dynamics, she found a remarkable new construction that bridges the holomorphic and symplectic aspects of moduli space, and used it to show that Thurston’s earthquake flow is ergodic and mixing.

Most recently, in the complex realm, Mirzakhani and her coworkers produced the long sought-after proof of the conjecture that – while the closure of a real geodesic in moduli space can be a fractal cobweb, defying classification – the closure of a complex geodesic is always an algebraic subvariety.

Her work has revealed that the rigidity theory of homogeneous spaces (developed by Margulis, Ratner and others) has a definite resonance in the highly inhomogeneous, but equally fundamental realm of moduli spaces, where many developments are still unfolding

 

Nevanlinna Prize Winner:

Subhash Khot

p_5

New York University, USA 

[Subhash Khot is awarded the Nevanlinna Prize] for his prescient  definition of the “Unique Games” problem, and his leadership in the effort to understand its complexity and its pivotal role in the study of efficient approximation of optimization problems, have produced breakthroughs in algorithmic design and approximation hardness, and new exciting interactions between computational complexity, analysis and geometry.

Subhash Khot defined the “Unique Games” in 2002 , and subsequently led the effort to understand its complexity and its pivotal role in the study of optimization problems. Khot and his collaborators demonstrated that the hardness of Unique Games implies a precise characterization of the best approximation factors achievable for a variety of NP-hard optimization problems. This discovery turned the Unique Games problem into a major open problem of the theory of computation.

The ongoing quest to study its complexity has had unexpected benefits. First, the reductions used in the above results identified new problems in analysis and geometry, invigorating analysis of Boolean functions, a field at the interface of mathematics and computer science. This led to new central limit theorems, invariance principles, isoperimetric inequalities, and inverse theorems, impacting research in computational complexity, pseudorandomness, learning and combinatorics. Second, Khot and his collaborators used intuitions stemming from their study of Unique Games to yield new lower bounds on the distortion incurred when embedding one metric space into another, as well as constructions of hard families of instances for common linear and semi- definite programming algorithms. This has inspired new work in algorithm design extending these methods, greatly enriching the theory of algorithms and its applications.

 

Gauss Prize Winner:

Stanley Osher

p_6

University of Califonia, USA 

[Stanley Osher is awarded the Gauss Prize] for his influential contributions to several fields in applied mathematics, and his far-ranging inventions have changed our conception of physical, perceptual, and mathematical concepts, giving us new tools to apprehend the world.

1. Stanley Osher has made influential contributions in a broad variety of fields in applied mathematics. These include high resolution shock capturing methods for hyperbolic equations, level set methods, PDE based methods in computer vision and image processing, and optimization. His numerical analysis contributions, including the Engquist-Osher scheme, TVD schemes, entropy conditions, ENO and WENO schemes and numerical schemes for Hamilton-Jacobi type equations have revolutionized the field. His level set contribu- tions include new level set calculus, novel numerical techniques, fluids and materials modeling, variational approaches, high codimension motion analysis, geometric optics, and the computation of discontinuous so- lutions to Hamilton-Jacobi equations; level set methods have been extremely influential in computer vision, image processing, and computer graphics. In addition, such new methods have motivated some of the most fundamental studies in the theory of PDEs in recent years, completing the picture of applied mathematics inspiring pure mathematics.

2. Stanley Osher has unique mentoring qualities: he has influenced the education of generations of outstanding applied mathematicians, and thanks to his entrepreneurship he has successfully brought his mathematics to industry.

Trained as an applied mathematician and an applied mathematician all his life, Osher continues to surprise the mathematical and numerical community with the invention of simple and clever schemes and formulas. His far-ranging inventions have changed our conception of physical, perceptual, and mathematical concepts, and have given us new tools to apprehend the world.

 

Chern Medalist:

Phillip Griffiths

p_7

Institute for Advanced Study, USA 

[Phillip Griths is awarded the 2014 Chern Medal] for his groundbreaking and transformative development of transcendental methods in complex geometry, particularly his seminal work in Hodge theory and periods of algebraic varieties.

Phillip Griffiths’s ongoing work in algebraic geometry, differential geometry, and differential equations has stimulated a wide range of advances in mathematics over the past 50 years and continues to influence and inspire an enormous body of research activity today.

He has brought to bear both classical techniques and strikingly original ideas on a variety of problems in real and complex geometry and laid out a program of applications to period mappings and domains, algebraic cycles, Nevanlinna theory, Brill-Noether theory, and topology of K¨ahler manifolds.

A characteristic of Griffithss work is that, while it often has a specific problem in view, it has served in multiple instances to open up an entire area to research.

Early on, he made connections between deformation theory and Hodge theory through infinitesimal methods, which led to his discovery of what are now known as the Griffiths infinitesimal period relations. These methods provided the motivation for the Griffiths intermediate Jacobian, which solved the problem of showing algebraic equivalence and homological equivalence of algebraic cycles are distinct. His work with C.H. Clemens on the non-rationality of the cubic threefold became a model for many further applications of transcendental methods to the study of algebraic varieties.

His wide-ranging investigations brought many new techniques to bear on these problems and led to insights and progress in many other areas of geometry that, at first glance, seem far removed from complex geometry. His related investigations into overdetermined systems of differential equations led a revitalization of this subject in the 1980s in the form of exterior differential systems, and he applied this to deep problems in modern differential geometry: Rigidity of isometric embeddings in the overdetermined case and local existence of smooth solutions in the determined case in dimension 3, drawing on deep results in hyperbolic PDEs(in collaborations with Berger, Bryant and Yang), as well as geometric formulations of integrability in the calculus of variations and in the geometry of Lax pairs and treatises on the geometry of conservation laws and variational problems in elliptic, hyperbolic and parabolic PDEs and exterior differential systems.

All of these areas, and many others in algebraic geometry, including web geometry, integrable systems, and Riemann surfaces, are currently seeing important developments that were stimulated by his work.

His teaching career and research leadership has inspired an astounding number of mathematicians who have gone on to stellar careers, both in mathematics and other disciplines. He has been generous with his time, writing many classic expository papers and books, such as “Principles of Algebraic Geometry”, with Joseph Harris, that have inspired students of the subject since the 1960s.

Griffiths has also extensively supported mathematics at the level of research and education through service on and chairmanship of numerous national and international committees and boards committees and boards. In addition to his research career, he served 8 years as Duke’s Provost and 12 years as the Director of the Institute for Advanced Study, and he currently chairs the Science Initiative Group, which assists the development of mathematical training centers in the developing world.

His legacy of research and service to both the mathematics community and the wider scientific world continues to be an inspiration to mathematicians world-wide, enriching our subject and advancing the discipline in manifold ways.

 

Leelavati Prize Winner:

Adrián Paenza

p_8

University of Buenos Aires, Argentina 

[Adrian Paenza is awarded the Leelavati Prize] for his contributions have definitively changed the mind of a whole country about the way it perceives mathematics in daily life. He accomplished this through his books, his TV programs, and his unique gift of enthusiasm and passion in communicating the beauty and joy of mathematics.

Adrián Paenza has been the host of the long-running weekly TV program “Cient´ıficos Industria Argentina” (“Scientists Made in Argentina”), currently in its twelfth consecutive season in an open TV channel. Within a beautiful and attractive interface, each program consists of interviews with mathematicians and scientists of very different disciplines, and ends with a mathematical problem, the solution of which is given in the next program.

He has also been the host of the TV program “Alterados por Pi” (“Altered by Pi”), a weekly half-hour show exclusively dedicated to the popularization of mathematics; this show is recorded in front of a live audience in several public schools around the country.

Since 2005, he has written a weekly column about general science, but mainly about mathematics, on the back page of P´agina 12, one of Argentinas three national newspapers. His articles include historical notes, teasers and even proofs of theorems.

He has written eight books dedicated to the popularization of mathematics: five under the name “Matem´atica

. . . ¿est´as ah´ı?” (“Math . . . are you there?”), published by Siglo XXI Editores, which have sold over a million copies. The first of the series, published in September 2005, headed the lists of best sellers for a record of 73 consecutive weeks, and is now in its 22nd edition. The enormous impact and influence of these books has extended throughout Latin America and Spain; they have also been published in Portugal, Italy, the Czech Republic, and Germany; an upcoming edition has been recently translated also into Chinese.

奇妙的动力系统和分形几何

动力系统,听起来像工程上面的发动机,但是它却是数学的一个分支,主要研究的就是在一个固定的规则下,一个点在时间的变化下在空间中的位置变化。比方说钟的摆动,动物种族的繁衍,全球的气候变动。这类的模型都属于动力系统。这篇文章要介绍的,是动力系统与分形几何的一些奇妙性质。

失之毫厘,差之千里

Screen Shot 2014-08-01 at 9.16.26 pm

1961年,作为天气预报员的Lorenz在利用计算机来做气象预测时,为了省事,就在第二次计算的时候,直接从第一次程序的中间开始运算。但是两次的预测结果产生了巨大的差异。Lorenz看到这个结果之后大为震惊,然后经过不断地测试,发觉在自己的模型当中,只要初始的数据不一样,就会产生不同的结果,而且结果大相径庭。在1979年的科学会议上,Lorenz简单的描述了“蝴蝶效应”:

一只蝴蝶在巴西轻拍翅膀,会使更多蝴蝶跟着一起轻拍翅膀。最后将有数千只的蝴蝶都跟着那只蝴蝶一同振翅,其所产生的巨风可以导致一个月后在美国德州发生一场龙卷风。                                                                                        —–Edward Norton Lorenz

Screen Shot 2014-08-01 at 9.15.29 pm

在实际的天期预测中,影响天气变化的因素数不胜数,用成千上万的变量来预测天气都不为过。科学家在研究问题的时候,就需要把一个非常复杂的问题简单化,但是又必须保证不能过于简单化。于是Lorenz在1963年发表的文章“Deterministic Nonperiodic Flow”里面提出了一个只有三个变量x, y, z的模型:

x^{'}(t) =\sigma(y-x)

y^{'}(t)=x(\rho-z)-y

z^{'}(t)=xy-\beta z

这个模型中,x,y,z是系统的状态,t是时间。\rho, \sigma, \beta是这个系统的参数。

A_Trajectory_Through_Phase_Space_in_a_Lorenz_Attractor

这个模型肯定不能够精确地描述天气变化,但是对于Lorenz解释他自己的观点恰到好处。在这个模型中,变量之间有着非线性的关系,虽然只有三个变量,但是随着时间的推移,三个变量就会交织在一起,互相影响。在三维的空间里面作图的时候,随着时间的推移,系统的演变就会趋近于一个混沌的区域,就像几根线缠绕在两个图钉周围,形如一只正在拍打翅膀的蝴蝶。这个或许就是Lorenz把这现象叫做”蝴蝶效应“的原因。在Lorenz原创性的论文里面,他一针见血地指出天气的影响因素是复杂多变的,即使有了精确地模型,也没有办法做长期的预测,只能够在观测中不停地一边预计,一边修正。

Screen Shot 2014-08-01 at 9.06.57 pm

从数学的角度来看蝴蝶效应,就是在一个给定的动力系统下,一个微小的初值变动就能够带动整个系统长期而巨大的连锁反应,是一种混沌的现象。从社会学的方面来说明就是一个坏的机制,如果不加以及时地引导、调节,会给社会带来非常大的危害,戏称为”龙卷风“或“风暴”;一个好的微小的机制,只要正确指引,经过一段时间的努力,将会产生轰动效应,或称为“革命”。从心理学的角度来诠释就是表面上看起来毫无影响,非常微小的一件事情,都会带来巨大的变化。当一个人小时候的心理受到微小的心理刺激,在这个人的成长过程中,这个刺激就会被逐渐的放大并且不停地影响一个人的生活,牵一发而动全身。混沌理论改变了人们对于科学的看法,简单的数学背后隐藏的可能性远远比人们想象的多得多。

分形几何学:复杂简单化

从欧几里德写几何原本开始,大家就习惯于研究非常规则的形状,比如圆形,方形,正多面体。欧几里德的几何原本就向大家展示了这些规则图形的几何美感。但是在研究了规则的图形之后,对于不规则的图形该怎么去研究就成为了困扰大家的一个问题。譬如河流的支流,树枝的形状,蜿蜒的海岸线,这些都不是规则的图形,甚至都不知道该怎么样去描述这些形状。但是通过观察就会发现它们都具有一个非常有趣的性质,那就是自相似性。比如树枝,从底部看上去,会有分岔,甚至越分越多。如果从某一个枝节往上看,它仍然是一颗树枝,形状跟原来的几乎没有什么分别。就是说在越来越小的尺寸下,不停的复制之前的形状。那么很自然的一个问题就是:怎么样在数学上去构造这些自相似的图形,而不是通过刻意的人工去生成这些图案?通常的思维习惯就是复杂必须来源于复杂,复杂不可能来源于简单,经验告诉我们复杂的东西必须守恒化。

Screen Shot 2014-08-01 at 9.01.34 pm

但是数学工作者Mandelbrot在科研中却发现了一个简单得不能够再简单的规则去生成这种复杂的图形,那就是

z \mapsto z^{2}+c

在这个简单的规则下,z会变成z^{2}+c,然后z^{2}+c作为新的自变量z,再次进行这个运算,长此下去,就将形成著名的Mandelbrot集合。前两张图片就是在上面规则下形成的完整的Mandelbrot集合。那么我们将它放大六倍,放大之后看到的形状跟第一幅图片惊人的相似,继续放大100倍,2000倍,依旧不会改变它的这个性质。

Animation_of_the_growth_of_the_Mandelbrot_set_as_you_iterate_towards_infinity1024px-Mandel_zoom_00_mandelbrot_set

Mandelbrot-similar-x1 Mandelbrot-similar-x6 Mandelbrot-similar-x100 Mandelbrot-similar-x2000

在Mandelbrot集合里面,无论被放得有多大,都会看到跟原来图形相似的形状。这样的结果就告诉大家复杂不一定来源于复杂,说不定它背后的机制非常的简单,那就是说,同一个事情,可能既是复杂的,又是简单的,这样就要重新去考虑简单和复杂之间的关系。后来有人为了纪念Benoît B. Mandelbrot创立了分形几何中的自相似性,写了一句话:Benoît B. Mandelbrot里面的字母B就代表了Benoît B. Mandelbrot。

除了有z \rightarrow z^{2}+c产生的Mandelbrot集合,还有一些经典的分形结构。比如说Cantor集合。Cantor集合是不断的从一个区间[0,1]取走中间一段获得的集合。首先去掉(\frac{1}{3}, \frac{2}{3}),剩下[0,\frac{1}{3}] \cup [\frac{2}{3}, 1]。然后把剩下两条线段的中间都去掉,剩下[0,\frac{1}{9}] \cup [\frac{2}{9}, \frac{1}{3}] \cup [\frac{2}{3}, \frac{7}{9}] \cup [\frac{8}{9}, 1]。不停的重复这个过程,最后剩下的集合就是Cantor集合。在数学中,Cantor集合是无穷无尽的,甚至是不可数的,但是却是不占据任何空间的,因为它的长度是零。下图简单的描述了Cantor集合的形成过程。

729px-Cantor_set_in_seven_iterations.svg

利用类似的想法,就可以构造出Sierpinski三角形和Siepinski地毯。

Animated_construction_of_Sierpinski_TriangleAnimated_Sierpinski_carpet

路漫漫其修远兮

这些也许就是动力系统的本质,在及其简单的规则背后,随着时间的不断推移,就能够创造出令人惊叹不已的复杂系统,就像河流支流的形成和Mandelbrot集合,就像天气以及动物的种族变化。这种规则的制定并不需要一个碍手碍脚的设计师,它就像是宇宙与生俱来的本事。人们对这种复杂的事物感到不自然的原因,可能就是在这种情况下不需要一个创造者。在给定初始条件的情况下,在给定的自然规则下,宇宙就可以自由的发展。而这个发展存在于自然科学和社会科学中,甚至生活中的方方面面。宇宙的复杂性以及美丽,都来源于简单的规则和不断的重复,规则虽然简单,但是过程却十分复杂,并且结果是不可预知的。即使有了科学的确定性,仍然也无法知道未来的样子。

 

 

2. 刚性定理

考虑二次多项式 f_{a}(x)=ax(1-x), a\in[0,4], f_{a}:[0,1]\rightarrow [0,1].

问题:

\{ a\in[0,4]: f_{a} \text{ satisfies Axiom A} \} 是否在 [0,4] 中稠密?

引理:f_{a} 满足 Axiom A \Leftrightarrow f_{a} 有双曲吸引周期轨。

定义 Kneading 序列: K(f_{a})=\{ i_{1}, i_{2},... \}, i_{k}=L \text{ if } f_{a}^{k}(\frac{1}{2})<\frac{1}{2};  i_{k}=c=\frac{1}{2} \text{ if } f_{a}^{k}(\frac{1}{2})=\frac{1}{2};  i_{k}=R \text{ if } f_{a}^{k}(\frac{1}{2})>\frac{1}{2}.

例子:

K(f_{4})=(R,L,L,L,...)=RLLL,

K(f_{1})=(L,L,L,L,...)=LLLL,

K(f_{2})=(c,c,c,c,...)=cccc,

K(f_{1.9})=(L,L,L,L,...)=LLLL,

在这里,f_{1}f_{1.9} 不是拓扑共轭的,即使它们的 Kneading 序列是一样的。

定义:f 和 g 称为拓扑共轭,如果存在同胚映射 h 使得 h\circ f= g \circ h.

性质1: 如果 f_{a_{1}} f_{a_{2}} 拓扑同胚,则有 K(f_{a_{1}})= K(f_{a_{2}}).

引理:如果 f_{a_{1}} f_{a_{2}} 没有双曲吸引或者双曲中性周期轨,则 K(f_{a_{1}})= K(f_{a_{2}}) \Rightarrow  f_{a_{1}} f_{a_{2}} 拓扑同胚 \Rightarrow a_{1}=a_{2}.

定义:拟共形映射的分析定义: \varphi: \Omega\rightarrow \tilde{\Omega}, 在这里 \Omega, \tilde{\Omega} 都是复平面上面的连通开集, \varphi 是保持定向的同胚映射,称 \varphi 是 K 拟共形映射 (K\geq 1), 如果

(1) \varphi 是 ACL 的,也就是线段上绝对连续,absolutely continuous on lines.

(2) | \frac{\partial \varphi}{\partial \overline{z}} | \leq \frac{K-1}{K+1} |\frac{\partial \varphi}{\partial z}| 几乎处处成立。

拟共形映射的一些性质:假设 \varphi 是 K-拟共形映射,K\geq 1.

(i) \varphi 几乎处处可微。对几乎所有的 z_{0}\in \Omega

\varphi(z) = \varphi(z_{0}) + \frac{\partial \varphi}{\partial z}(z_{0})(z-z_{0}) + \frac{\partial \varphi}{\partial \overline{z}}(z_{0})\overline{(z-z_{0})}+ o(|z-z_{0}|).

| \frac{\partial \varphi}{\partial z}|>0 几乎处处成立。

定义 \varphi 的复特征是 \mu_{\varphi}= \frac{\partial \varphi}{\partial \overline{z}} / \frac{\partial \varphi}{\partial z},||\mu_{\varphi}||_{\infty} \leq \frac{K-1}{K+1} <1.

(ii) Measurable Riemann Mapping Theorem ( Ahlfors-Bers )

 

Assume f_{a}(x)=ax(1-x), a_{0} \in (0,4]

Comb(a_{0})=\{ a\in(0,4]: K(f_{a})=K(f_{a_{0}}) \},

Top(a_{0})= \{ a\in (0,4]: f_{a} \text{ and } f_{a_{0}} \text{ are topological conjugate } \},

\Rightarrow Top(a_{0}) \subseteq Comb(a_{0}).

Qc(a_{0}) = \{ a\in (0,4]: f_{a} \text{ and } f_{a_{0}} \text{ are quasi-conformal conjugate} \},

Aff(a_{0}) = \{ a\in (0,4]: f_{a} \text{ and } f_{a_{0}} \text{ are linear conjugate} \},

\Rightarrow Aff(a_{0}) \subseteq Qc(a_{0}).

刚性问题:Comb(a_{0})=Qc(a_{0}) ? Comb(a_{0})=Aff(a_{0})?

 

定理:( Graczyk – Swiatek, Lyubich, 1997) 假设 f_{a_{0}} 没有双曲吸引或者中性周期轨,则 Comb(a_{0})=Qc(a_{0}).

推论:( Sullivan, 1988) Axiom A 系统在实系数二次多项式中稠密。

1.一维动力系统中的双曲性

定理: 假设f:[0,1]\rightarrow [0,1] C^{k}, k是正整数,则存在C^{k} 函数 f_{n}:[0,1]\rightarrow [0,1] 使得 || f_{n}- f ||_{C^{k}}=\max_{x\in[0,1]} \max_{0\leq m\leq k} |D^{m}f_{n}(x)-D^{m}f(x)| \rightarrow 0 as k\rightarrow \infty, 这里的每个f_{n}都满足Axiom A。

 

假设X是紧致度量空间,f:X\rightarrow X是连续函数。如果n是使得f^{n}(x)=x的最小正整数,则称x是以n为周期的周期点。

定义:

\omega(x)=\{ y\in X: \exists n_{k} \rightarrow \infty, f^{n_{k}}(x)\rightarrow y\}.

正向不变集:f(A)\subseteq A,

反向不变集:f^{-1}(A)\subseteq A,

完全不变集:f^{-1}(A)=A i.e. f(A)\subseteq A and f^{-1}(A)\subseteq A.

假设X=[0,1], f(x)\in C^{1}[0,1], \{ x, f(x), ... , f^{n-1}(x)\} 是以n为周期的周期轨道, 定义乘子(multiplier) \lambda=Df^{n}(x)=Df(x)\cdot Df(f(x)) ... Df(f^{n-1}(x))

|\lambda| \neq 1称为orb(x)=\{ f^{k}(x): k=0,1,2... \}是双曲周期轨。

|\lambda|=1称为中性周期轨。

|\lambda|<1称为双曲吸引轨。

|\lambda|>1称为双曲斥性轨。

双曲集合(hyperbolic set):假设f:[0,1]\rightarrow [0,1]C^{1}映射,A是紧集并且f(A)\subseteq A。如果存在C>0, \lambda>1使得对任意的x\in A, n\geq 1, 有|Df^{n}(x)| \geq C\lambda^{n},则称A是双曲集。

Axiom A: 假设 f:[0,1]\rightarrow [0,1]C^{1} 映射,称 f 满足 Axiom A是指:

(1)f 有有限多个双曲吸引轨 \theta_{1},...,\theta_{m},

(2)B(\theta_{i}) 是双曲吸引轨 \theta_{i} 的吸引区域, \Omega=[0,1]\setminus \cup_{i=1}^{m}B(\theta_{i}) 是双曲集。

例子1:f(x)=-x^{2},1是双曲斥性不动点,0是双曲吸引不动点。B(\{0\})=(-1,1), \Omega=[-1,1]\setminus B(\{0\})=\{-1,1\}. f^{n}(x)=x^{2^{n}} , Df^{n}(x)=2^{n}x^{2^{n}-1}. 取C=1, \lambda=2.

例子2:f(x)=2x(1-x), f(x)=ax(1-x).

 

性质1: 双曲斥性周期轨一定是双曲集。

性质2: 双曲集中没有临界点。

性质3: 双曲集合中任何一个周期轨都是双曲斥性的。

 

命题:假设f:[0,1]\rightarrow [0,1]属于C^{1+\alpha}并且\alpha \in(0,1). i.e. Df(x)\alpha-Holder连续的,|Df(x)-Df(y)|\leq C|x-y|^{\alpha}.如果A是双曲集,则A的Lebesgue测度是零。

证明:

 

 

定理(Mane,1985)(CMP)
假设 f:[0,1]\rightarrow [0,1] 是一个 C^{2} 的映射,

(1) f 的所有周期轨都是双曲的。

(2) Crit(f) 指的是 f 的临界点。\forall c\in Crit(f), 则存在双曲吸引周期轨 \theta_{c} 使得 d(f^{n}(c),\theta_{c})\rightarrow 0, n\rightarrow \infty.

\Longleftrightarrow f 满足 Axiom A。

另外一种形式:

假设f:[0,1]\rightarrow [0,1]是一个C^{2}的映射,

U\subseteq Crit(f)\cup \text{ hyperbolic attracting orbits }\cup \text{ and neutral orbits } ,

\Lambda_{U} = \{ x\in[0,1]: f^{n}(x)\notin U, \forall n\geq 0 \},

\Rightarrow \Lambda_{U} 是双曲集。

 

科普文:从人人网看网络科学(Network Science)的X个经典问题

http://blog.renren.com/share/270937572/16694767796?from=0101010202&ref=hotnewsfeed&sfet=102&fin=3&fid=24297752024&ff_id=270937572&platform=0&expose_time=1386225841

科普文:从人人网看网络科学(Network Science)的X个经典问题作者: 邓岳

    长文,写了N个小时写完的。你肯定能看懂,所以希望你能看完,没看完就分享/点赞没有意义。有图有超链接,不建议用手机看。相关内容我想应该可以弄成一个小项目加到某门课中。

网络科学是这两年非常热门的研究方向,具体的研究方向、问题也很多。本文用人人网举几个简单例子,粗浅的说明一下网络科学中的一些经典问题。

社交网络(社会网络)是典型的的复杂网络,有小世界、模块化、无标度等特性。网络中节点(node)代表人,连边(edge/link)代表两个人是好友关系。下图是示意图,红点的意思后面说。

1、链路预测(Link Prediction)

链路预测是预测网络中“不存在但应该存在的边”或“现在不存在但以后可能存在的边”。对应到人人网,前者是说俩人实际上认识,但在人人里没加好友(人人数据相比真实来说缺失了一部分)。后者说俩人现在不认识,但有成为朋友的潜质(比如有共同兴趣等)。本文以前者为例进行说明,对应人人里的好友推荐

人人的好友推荐大体会给你推荐和你的共同好友多的人。潜台词:两人的共同好友数越多,他们认识的可能性也越高。

 (1)最简单的指标(Common Neighbors,CN)

某人的好友即为其在网络中的neighbors(有边相连的node),共同好友即为两人好友的交集。

CN为两人共同好友的个数,直观感觉CN越大,此二人是好友的可能性越大。

CN(x,y)=|N(x)∩N(y)|        N(x)为节点x的所有邻居(计算原因也可以加上x自己)

直观感觉基本没有问题,但CN会让好友多的人“占便宜”。举一个例子:你的好友A和你在人人里还没加好友,A不太玩人人,在人人里只有2个好友,这两个人你都认识,你和A的CN=2。而你的辅导员和你的CN=50(你班里很多同学都和辅导员是人人好友)。如果人人按CN从高到低推荐,那很可能推荐辅导员,很可能不会给你推荐A。

  (2)改进指标(Jaccard index)

从上得知,好友多的人沾光,应进行修正(惩罚),Jaccard系数定义如下:

Jaccard(x,y)=|N(x)∩N(y)| / |N(x)∪N(y)|

当然,改进方法还有很多,大体都是惩罚度数高的节点。比如如除以 k(x)+k(y) 或 k(x)*k(y) 等。k(x)为网络中节点的度,相当于邻居个数。

你会发现人人也会给你推荐只有两三个共同好友的朋友,就是这个原因。

代码如何实现?

数据结构里学过图(graph),图就是网络(network)。图的存储可以有邻接矩阵、邻接表等。

邻接矩阵:方阵,行和列都是网络里所有节点,矩阵元素0代表两节点有连边,1代表无连边。

邻接表:每行为一个节点,后面跟链表,链起来它的所有邻居。适合存储稀疏网络。

先说邻接矩阵,节点x所在行/列所有为1的元素对应的列/行即为x的好友(邻居),同理求y的好友,即可进行计算。

邻接表更简单,x所在行的链表就是x的好友。

 我刚注册人人,一个好友都没有,何谈共同好友?

这是链路预测的一个经典挑战:冷启动(Cold Start)。即对新用户(信息很少)的推荐。

人人网当然不会在共同好友一棵树上吊死,对于新用户,它会根据用户填写的资料进行推荐。如推荐和你同年入学同所学校,或同年龄家乡在相同城市的人等。

    这里还有问题大家可以考虑:推荐的都是目前和你没加好友的人,但整个人人里和你不是好友的有几千万人,总不能给这些人全都和你算一个共同好友数目,然后排序推荐。如何能圈定一个大致的范围?

==================================

2、社团发现(Community Detection)

社团发现是网络科学里另一个火爆问题。社团通常指网络中比较稠密(dense)的部分,就是连边紧密的几个人组成的一个小团体。社团具有“内紧外松”的性质,即社团内部连边稠密,而和社团外部的点连边相对稀疏。

比如你和宿舍的7个人,一共8个人,每两个人都加了人人好友。在网络里就是8个两两相连的点,构成一个大小为8的完全图(complete graph/clique,图中任两点间都有连边)。

假设有N个点,若最极端情况两两相连构成完全图,共有 N(N-1)/2条边。这是N个点之间边数的上线。

可以用简单的方法衡量网络中的一个子图(M条边,N个节点)的稠密程度:dense=M/[N(N-1)/2],显然dense∈[0,1]。

说着简单,给你一个网络看着好像也简单,上图左边三个红点之间就明显有一个社团,但找着可就难了。社团发现的方法流派很多,我之前发的一篇日志列了主流的一些方法。

在社团发现中,还有一大类方法是处理重叠社团(overlapping community)。所谓重叠社团,指节点可以属于多于一个的社团。比如考虑你的好友,高中同学可以聚成一个社团,本科同学可以聚成一个社团,而你既属于高中同学的社团,也属于本科同学的社团。

需要注意的是:在网络中(尤其是生物网络),一个社团/模块未必会表现出连接稠密的性质。

==================================

3、中心性(Centrality)

复杂网络研究中的“中心性”包括:度中心性(Degree Centrality)、介数中心性(Betweenness Centrality)、接近中心性 (Closeness Centrality) 等多种度量方式。

先简单介绍一下度中心性

度中心性就是节点的度(节点邻居个数),度数高的节点一般叫做Hub节点。

度中心性说明了什么?

人人网络中度数高的节点就是好友多,微博网络中度数高的节点就是粉丝多。Hub节点对于网络信息扩散有很大帮助。

度中心性有什么应用?

我想在微博上打广告,只要在李开复、留几手这种Hub节点上投放广告,很快很多用户就能看到。

对于传染病传播来说,Hub节点因为能接触到多人,因此一旦得病很容易传染给周边人群。

对于铁路网络(节点是火车站,边表示两火车站间存在铁路),像徐州、郑州这种交通枢纽城市即为Hub节点。有意识的在全国不同地区设置Hub节点,可以优化车次中转。

然后稍详细说一下介数中心性

介数中心性用来衡量网络中节点和边的重要性,和最短路径紧密相关。

节点s的介数中心性 = 网络中任意两点间的最短路径中通过s的最短路径条数 / 网络中任意两点间的最短路径数

边e的介数中心性 = 网络中任意两点间的最短路径中通过e的最短路径条数 / 网络中任意两点间的最短路径数

对于连通网络,上式分母即为 N(N-1)/2。分子是看这么多条最短路径中,有多少条经过s。

显然 节点的介数中心性∈[0,1]。对于边缘节点(叶子),介数中心性为0;对于星型网络的中央节点,因为所有的最短路径都经过它,所以它的介数中心性为1。

介数中心性说明了什么?

介数中心性高的节点/边一般处于网络的“交通要道”,起到信息传输桥梁的作用,通常处于两个社团之间。上图中红色的点都是介数中心性较高的点。

介数中心性有什么应用?

对于网络攻击而言,希望击毁尽可能少的节点就让网络瘫痪,则可以选择介数中心性高的节点进行攻击。若把上图中6个介数中心性高的节点移除(同时删除红点连接的边),网络就会被分割成多个碎片。

对于上面说的社团发现而言,也可以通过移除介数中心性高的节点,让网络自然分成内紧外松的几部分。但这样做会有一个问题:移除的点到底属于哪个社团?大家可以自己考虑。

介数中心性怎么算?

先用Floyd-Warshall算法计算网络中任意两点的最短路径算法,作为分母。分子就是通过某个顶点或某条边的最短路径。但是,Floyd算法就复杂度O(节点数的3次方),对于大网络效率太低(社会网络甚至可以得到千万甚至亿级节点)。

经典的Brandes算法对于无权图(认为每条边的长度均为1)的复杂度可以到O(节点数*边数)。

最后简单说说接近中心性

接近中心性的基本思想:如果一个人能很容易的联系到其他人,那么TA就是中心的。即TA到其他所以人的距离比较短。在定义时采用网络中两点间最短距离的概念(Dijkstra算法没忘吧),定义如下:

节点i的接近中心性 = ( n-1 ) / ( i到网络中其他点的最短距离的和 )。其中n为网络中节点个数,n-1即为除了i之外的总节点数目。

接近中心性的应用在后面会说。

==================================

4、复杂网络的一些拓扑特性

 (1)小世界(Small World)

常听到的一个概念,社交网络中的“六度分隔”原理(Six Degree Separation)就是小世界现象的一个表现。其核心是网络中任意两点的平均最短距离低。

从人人网也能看出,它是很稀疏的网络,整个网络几千万个节点,每个节点的邻居一般就是几十到几百个。任意选出两个节点,它们之间有连边的可能性很低,但它们的最短路径一般很短(从Facebook数据来看,网络中任意两点的平均最短路径约为4)。

那么,如何能把一个规则网络变成满足小世界特性的网络呢?可以通过随机重连边的方式,看下图(a),原版的网络(深灰色边)是一个规则网络,每个节点的度均为4(连接它左右的4个节点)。通过随机把一些边重连(黑色边),会产生一些“捷径”(short cut),能迅速降低网络中各点的平均最短距离。

 (2)无标度(Scale-free)

整个人人网里,如果把每个节点的度(好友个数)做一个统计,会是什么样的分布呢?

可以想象,度数高的节点是很少的,度数低的节点占绝大多数

无标度网络各节点之间的连接状况(度数)具有严重的不均匀分布性:网络中少数Hub节点拥有极其多的连接,而大多数节点只有很少量的连接。具体来说,节点的度分布符合幂率分布(power law)。所谓无标度,指的是少数高度数的节点的度数非常高,高到爆表。

那么,如何能生成一个无标度网络?经典的BA模型是这样说的:当有新的点加入到网络中时,它会根据概率和网络中的其他点相连,新的点X和老的点Y相连的概率正比于Y的度,即新的节点更倾向于连接到原本度数高的节点(Hub),也称为“优先连接(preferential attachment)”特性,或者叫“富者更富(rich get richer)”或者“马太效应(Matthew effect)”。通过反复添加新节点,就能构建出一个无标度网络。

这个用人人举例不是非常形象,我换用微博举例。一个新注册的微博用户会关注哪些微博呢?有很大可能是关注那些已成名的“大V”(手哥、李开复……)。“大V”即为Hub节点,新节点偏向于和这些Hub相连。如果没有突发情况,“大V”的粉丝增加速度始终会快过普通用户,即“富者更富”。

幂率分布是个复杂网络研究中的大神级概念,感兴趣的同学务必自己多查资料。这里举个简单例子:你一天要发很多短信,我们考虑一下两条短信之间的相隔时间。首先,间隔肯定不是固定的(比如你每小时发一条短信),显然是随机的。那这个随机的时间是否符合某种概率分布?直观想可能是高斯分布(钟形),大多数短信间隔时间差不多(比如每小时左右发一条),但少数短信间隔时间很长(半天也没发一条)或很短(1小时发了3条)。但实际上,发短信存在着“爆发”现象,某些情况下你可能10分钟就发了10条(联系其他同学吃饭,或者通知),而有时可能10小时也没有发一条(睡觉)。而这种爆发在高斯分布中是不可能出现的,感兴趣的同学可看巴拉巴西的《爆发》

 (3)Giant Component

一个规模很大的网络中,可能存在多于一个的连通分支/连通分量(Connected Component),即网络中并非所有节点都连通。但是在真实网络中通常存在一个规模很大的连通分支(即Giant Component)会包含网络中大多数的节点(比如超过80%的节点)。

如果一个人刚注册人人,还没加好友,那他就是一个孤立点(没有和任何其他点相连)。而一旦有了好友,就不再是孤立点。如果某个学校规定学生在人人只能加本校好友,不能有外校好友,那么这个学校的学生也会构成一个小的独立的连通分支。不过可以想到,人人里绝大多数的同学都处在一个超大的连通分支中,真正孤立的节点或小的连通分支所占用户数是很少的。

Facebook的数据显示约99.7%的用户处在一个超大的Giant Component中。

==================================

5、高影响力节点(high influential nodes)/传播(spread)

识别网络中的高影响力节点(leader)有什么用?拿信息传播来说,如果你希望你的信息能迅速在网络中传播,那你就要考虑选择通过高影响力节点来传播信息。

早期的研究结果认为网络中的高度数(Hub节点)或高介数的节点是传播中最有影响力的节点,这是因为Hubs节点拥有更多的人际关系,而高介数的节点有更多的最短路径通过。

高度数节点的例子:你把一卡通丢了,在人人发了一个寻物状态,然后@了一些公共主页(西电小喇叭、西电睿思、在西电……)。因为你觉得这些公共主页因为好友比较多(degree高,属于网络中的Hub),能更有效地把你的信息扩散出去。

高介数节点的例子:你班里的同学A和计算机学院的同学B是情侣。你的班级想和计算机学院合办一个讲座,班长发状态圈了A,A分享给B,B的分享就让很多计算机学院的同学看到了这个信息。尽管A、B两人的好友(degree)未必很高,但A、B算是软件学院、计算机学院两个社团之间的“桥梁”(请回看2中的图中的红点),介数中心性比较高(请回看3),也在信息传播中扮演重要角色。

上面两个例子都很好理解,很符合直观思维,但说的就是对的么?

从直观来想,对于一个网络,应该是靠近网络“中心”的节点具有较高的传播能力。如上图中间红圈中的4个点,可以很迅速的把信息传播到绿圈和蓝圈。

对于上图中的黄色点,虽然其度数很高,但由于并不处于网络“中心”,传播能力显然不如红圈中的4个点。例如:黄色点要把信息传给最下方的红点(位于蓝圈),就需要先传到红圈中,再向下传到绿圈中的红点,最后才能传到蓝圈。

上例和之前认为”高度数节点的影响力高“存在着明显矛盾。“度数高的点影响力高”好理解且好计算,因为网络中一个节点的度数,就是其邻居个数,不管是用邻接矩阵还是邻接表存储,都很容易求得。但“靠近中心”这是个很模糊的说法,和网络的布局(layout)有关系。如果一个网络已经画出来了,我们一眼就能看出所谓的“中心”在哪里。但如果是采用另一个画法,比如你人为把上图中的蓝色点拖到网络外围,那它也就不在中心了。

说到这里,就产生了一个新问题:如何形式化的(用数学手段)确定网络的“中心”?

直观的看,可以用上面的“接近中心性”来计算。位于网络中心的节点,和其他节点的最短距离都比较小;而位于网络边缘的节点,和很多节点(位于网络另一侧的节点)的最短距离都比较大。但是,接近中心性需要计算网络中任意两点间的最短距离,计算量很大。

也有研究者提出利用图论中的概念“K-Shell”来进行:

1、首先删除网络中所有度为1的节点。删完以后检查,原来某些度大于1的节点会变成度为1的,就继续删,删完再检查,再删……直到没有度为1的节点为止。最终,认为刚才所有删掉的的节点属于第一层,即ks=1的节点(上图外围蓝色圈);

2、现在网络中肯定没有度为1的节点了(都删掉了),那就开始删除度为2的节点(和上一步方法一致)。这次删掉的就是ks=2的节点(上图绿圈);

3、依此类推,接着删度为3的节点,然后是度为4的节点……最终网络删干净了,网络中所有的点都被分配了一个ks值。

实验研究表明,对于单个传播源的情形(你只圈一个公共主页让其帮你转发),Hubs节点或者高介数的节点不一定是最有影响力的节点,而通过K-shell分解分析确定的网络核心节点(K-shell值大的节点)才是最有影响力的节点;对于多传播源情况下(你圈了多个公共主页来帮你转发),传播的规模很大程度依赖于初始传播源之间的距离,此时选择多个Hub节点是比较有效的。

==================================

6、网络的比对(alignment)、去匿名化(de-anonymization)

我们一般不止用一个SNS(比如同时在用QQ和人人),那多个SNS网络存在着部分一致(比如你宿舍8个人在人人上互相加了好友,在QQ上也互相是好友)。网络比对的目的就是寻找两个网络中的节点的对应关系。

如上图(图片来自http://botao.hu/works/research/de-anonymizingsocialnetworks)所示,比如说左边网络是人人,右边是QQ。根据两个网络极其中节点的拓扑特征,我们把两个网络中的节点进行映射(虚线),即虚线相连的两个点,是同一个实体(人)。

通过网络比对,得到两个网络间节点映射,有啥用?

一个应用是“去匿名化”,顾名思义就是搞清楚网络中的匿名用户到底是谁。比如说公安机关发现人人网里一个名为wurht的用户在疯狂造谣(左图左下角翠绿节点),但人人中其注册所用信息是假的。通过网络比对,可以知道人人中的wurht用户对应着QQ中的wurh用户,而通过腾讯可以查到其身份,自然也就知道了人人中wurht的真实身份。

需要注意的是,网络比对可以仅利用拓扑特征,也可整合更多的特征(比如节点的某些个人信息)。比如腾讯的圈子中的“智能备注”功能,会根据一个用户的好友对其的备注,来推断其真实身份(根据腾讯官方说明:如果圈子内有半数以上的好友对您设置相同的QQ备注名,这个备注名将默认成为您在圈子内展现的名称。)。这和标签传播(Label Propagation,用已标记节点的标签信息去预测未标记节点的标签信息)的思想有些类似。

==================================

7、动态网络(Dynamic Networks)/演化(Evolution)

这里的动态网络主要指网络在不同时间点的变化。拿人人网为例,如果你每月的一号把人人网里你好友的数据储存一次(保存你的好友、好友的好友之间的连边关系),连续存上一年,就形成了由12个静态网络组成的动态网络。在这12个静态网络中,节点和边可能都不一样(注意在前面链路预测研究中,节点是不变的)。

我们把这12个静态网络依次来看,看每个月和下一个月之间的差异。可以用前面讲的社团发现的方法,对每个静态网络进行聚类,然后研究相邻两个月之间社团的变化情况。可以定义如下六种相邻时刻间的变化:

(1)Growth:同一个社团,在相邻时刻出现了增长(如节点变多了)。和Growth相对的就是Contraction。

(2)Merging:t时刻的两个社团,在t+1时刻合并成了一个社团。和Merging相对的就是Splitting。

(3)Birth:t+1时刻新出现了一个社团(该社团在t时刻不存在)。和Birth相对的就是Death。

比如在1月1号时,软件学院的小帅(男)和人文学院的小美(女)的两个宿舍,各自构成了一个社团(比如是8个节点的完全图)。但由于这两个宿舍的人互相之间并不认识,所以两个社团间的连边很少。在1月中旬,小帅和小美成为了情侣,两个宿舍的人也开始互相认识,大部分同学之间也加了人人好友。在2月1号,由于这两个宿舍原先构成的社团之间多了很多连边,这两个社团就合成了一个大的社团(16个节点的近似完全图)。这就是上图的Merging。又过了几个月,俩人又分手了,两个宿舍的人也纷纷取消删了之间的好友关系,一个大社团又回到了最早的两个社团,即为Splitting。

 

【未完待续】

==================================

网络科学自学资料

一、书

1、《网络科学导论》《链路预测》《网络科学》 、《Network Science》 入门且全面,正统的Network Science

2、《推荐系统实践》《推荐系统》 主流的Web应用里都有推荐系统,算是网络科学的主要应用方向

3、《网络、群体与市场》 也是入门书,结合经济学、社会学、计算与信息科学以及应用数学的有关概念与方法,考察网络行为原理及其效应。

4、《链接》 巴拉巴西早期经典著作

二、公开课

1、Social Network Analysis 2013.3开课时我全程跟下来并拿到了成绩。2013.10再次开课。强烈建议英文差不多的10级不考研同学10月跟一下这个。

2、网络、群体与市场 中文公开课,在Coursera也有。对软件方向的同学,建议重点看一下课程的第2、3、9、11、13~17章。

3、Social and Economic Networks: Models and Analysis,斯坦福,2014.1开课(来自文后留言)