MathGroup Archive 2006

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Reduction of Radicals

  • To: mathgroup at
  • Subject: [mg71984] Re: [mg71902] Reduction of Radicals
  • From: Daniel Lichtblau <danl at>
  • Date: Thu, 7 Dec 2006 06:25:28 -0500 (EST)
  • References: <>

dimitris wrote:
> Based on this reference
> Cardan Polynomials and the Reduction of Radicals (by T. Osler)
> (see also references therein)
> (you can download the paper here:
> )
> the following expression can be reduced to 1
> z = (2 + Sqrt[5])^(1/3) + (2 - Sqrt[5])^(1/3)
> Mathematica gives
> N[%]
> 1.9270509831248424 + 0.535233134659635*I
> This is because by default it returns a complex number for the cube
> root of a negative number
> List @@ z
> N[%]
> {(2 - Sqrt[5])^(1/3), (2 + Sqrt[5])^(1/3)}
> {0.30901699437494756 + 0.535233134659635*I, 1.618033988749895}
> However defining
> mycuberoot[x_] := Block[{w}, w = w /. Solve[w^3 == 1][[3]]; If[Re[x] <
> 0, w*x^(1/3), x^(1/3)]]
> Then
> {2 - Sqrt[5], 2 + Sqrt[5]}
> mycuberoot /@ %
> FullSimplify[%]
> Together[Plus @@ %]
> {2 - Sqrt[5], 2 + Sqrt[5]}
> {(-1)^(2/3)*(2 - Sqrt[5])^(1/3), (2 + Sqrt[5])^(1/3)}
> {(1/2)*(1 - Sqrt[5]), (1/2)*(1 + Sqrt[5])}
> 1
> Is there a particular reason why by default Mathematicas returns a
> complex number for the cube root of a negative number or it is a matter
> of choise?
> Following the same procedure I prove that
> (10 + 6*Sqrt[3])^(1/3) + (10 - 6*Sqrt[3])^(1/3)
> is equal to 2. Indeed
> {10 + 6*Sqrt[3], 10 - 6*Sqrt[3]}
> mycuberoot /@ %
> FullSimplify[%]
> Together[Plus @@ %]
> {10 + 6*Sqrt[3], 10 - 6*Sqrt[3]}
> {(10 + 6*Sqrt[3])^(1/3), (-1)^(2/3)*(10 - 6*Sqrt[3])^(1/3)}
> {1 + Sqrt[3], 1 - Sqrt[3]}
> 2
> This behavior of Mathematica does not affect simplifications by e.g.
> RootReduce?
> I must admit that I have gaps on my knowledge in these symbolic aspects
> (I start to be interested in after I try to solve the the secular
> Rayleigh equation)
> so more experienced members of the forum may forgive any possible
> mistakes of mine!
> Anyway I don't understand this difference in treating nested radicals
> between literature and    Mathematica.
> I really appreciate any kind of insight/guideness/comments.
> Regards
> Dimitris

The use of principal roots has received some replies but since I liked 
the Osler article above and wanted to comment I thought I'd revisit.

(1) As pointed out by others (A. Kozlowski, M. Eisenberg) use of 
principal values for fractional roots means, among other things, that 
Power can be defined in terms of Log, and they can share a branch cut. 
This is useful in and of itself (try figuring out jumps in definite 
integration with a proliferation of functions having unrelated btranch 

Another reason to like the definition a^b==Exp[b*Log[a]] is that for r>0 
it makes f[x_]=(-r)^x differentiable in x. With a choice of negative 
roots for x equal to 1/n, n an odd integer, such a function would fail 
even to be continuous.

Another nice feature is that it becomes simple to recover "surds" (that 
is, the full set of values for a radical a^(1/n)) simply by taking the 
principal value and multiplying by powers of the principal nth root of 
unity. Were we to have (-1)^(1/3), say, be simply -1, then one would be 
forced to use explicit complex exponentials instead of root-of-unity 
radicals in order to attain the principal value. But having (-1)^(1/3) 
be the principal value means we can easily attain other roots such as -1 
by multiplying by appropriate powers of this root of unity.

(2) Osler's paper discusses some ways to reduce certain radicals to 
simpler forms. This is a special case of radical denesting. In this case 
one can use polynomial algebra techniques coupled with a selection 
procedure to remove parasite roots.

One example uses something resembling

(2+Sqrt[5])^(1/3) + (2-Sqrt[5])^(1/3)

EXCEPT with the convention that the cube of the negative is a negative 
rather than principal value. To find the desired value one might make 
new variables for radicals and polynomials to define them (in effect 
giving the surds, or full sets of values), eliminate all variables other 
than the one representing the value of interest, and then find the root 
of the resulting polynomial (in that remaining variable) that lies in 
the region of interest. For this example we might let


with defining polynomial

x^3-(2+z) (where y is given by z^2-5). Letting t be the value of 
interest and continuing in this way, we would get

polys = {t-(x+y), x^3-(2+z), y^3-(2-z), z^2-5};

Now form a Groebner basis eliminating all but t, and solve for t.

roots = t /. Solve[First[GroebnerBasis[polys,t,{x,y,z}]]==0, t];

Last, select the root that is real valued.

In[36]:= Select[roots, Element[#,Reals]&]
Out[36]= {1}

Of course we could use direct built in functionality, provided we first 
translate the expression as per note (1) above so that we are indeed 
getting the negative root for the second summand. This summand thus 
becomes (-1)^(2/3)*(2-Sqrt[5])^(1/3)

and we do

In[37]:= RootReduce[(2+Sqrt[5])^(1/3) + (-1)^(2/3)*(2-Sqrt[5])^(1/3)]
Out[37]= 1

The last example in the paper is a bit more complicated but can be 
handled in exactly the same ways. It was from the dedication of a 1997 
paper, commerating an anniversary of the birth of Ramanujan.

const = (32*(146410001/48400)^3 - 6*(146410001/48400));
polys = {t^6-(const+b), b^2-(const^2-1)};
roots = t /. Solve[First[GroebnerBasis[polys,t,b]]==0, t];

Now grab any root real and larger than 1.

In[41]:= Select[roots, Element[#,Reals]&&#>1&]
Out[41]= {110}

(3) Osler defines and uses "Cardan polynomials" to do radical reduction. 
  I cannot help but notice* that these have interesting combinatorial, 
algebraic, and analytic properties, a few of which I'll describe. First 
his definition:

Ca[n_,x_,y_] := Expand[2*y^(n/2)*ChebyshevT[n,x/(2*Sqrt[y])]]

For example:

In[45]:= InputForm[Ca[9,x,y]]
Out[45]= x^9 - 9*x^7*y + 27*x^5*y^2 - 30*x^3*y^3 + 9*x*y^4

We'll work with a closely related family wherein we take absolute values 
of coefficients and also add a term y^n.

Da[n_,x_,y_] := Expand[-I^n*Ca[n,I*x,y]+y^n]

In[47]:= InputForm[Da[9,x,y]]
Out[47]= x^9 + 9*x^7*y + 27*x^5*y^2 + 30*x^3*y^3 + 9*x*y^4 + y^9

Finally we'll want to define some very familiar polynomials.

Ru[n_,x_,y_] := Expand[(x+y)^n]

In[49]:= InputForm[Ru[9,x,y]]
x^9 + 9*x^8*y + 36*x^7*y^2 + 84*x^6*y^3 + 126*x^5*y^4 + 126*x^4*y^5 +
  84*x^3*y^6 + 36*x^2*y^7 + 9*x*y^8 + y^9

Note that Ru[n,x,y] has coefficients from the nth row of the Pascal 

(A) If one looks at rows of coefficients from Da[n,x,y] as n increases 
one sees an assymmetric triangle. For example, the 11th row would be

1   11  44  77  55  11   1

We can derive this from the 11th row of the Pascal triangle using a 
one-sided differencing. That is, for the kth element of the mth row, 
we'll take absolute value of alternating sums from the prior row, up to 
the kth element.

Pa[n_,k_,0] := Binomial[n,k]
Pa[n_,k_,m_] := Abs[Sum[(-1)^(j-1)*Pa[n,j,m-1],{j,m,k}]]

Now let's look at a table of these values, for n=11.

assymmetrictable[n_] := Table[Pa[n,k,j], {j,0,(n-1)/2}, {k,0,n-1}]

In[146]:= InputForm[assymmetrictable[11]]
{{1, 11, 55, 165, 330, 462, 462, 330, 165, 55, 11},
  {0, 11, 44, 121, 209, 253, 209, 121, 44, 11, 0},
  {0, 0, 44, 77, 132, 121, 88, 33, 11, 0, 0},
  {0, 0, 0, 77, 55, 66, 22, 11, 0, 0, 0},
  {0, 0, 0, 0, 55, 11, 11, 0, 0, 0, 0},
  {0, 0, 0, 0, 0, 11, 0, 0, 0, 0, 0}}

We have recovered the nontrivial "middle" coefficients for Da[11,m] (the 
two "end" coefficients are simply unity) from the first nonzero element 
of row m: 11, 44, 77, 55, 11.

There are nice combinatorial formulas for these coefficients; I wanted 
to illustrate an algorithmic approach to recovering them from the 
binomial coefficients of the Pascal triangle.

(B) The maps Ru[n,x,y] and Da[n,x,y] each posess a group invariance 
property. And, returning to the original intent of the thread, thses 
involve multiplication by principal roots of unity. For Ru[n,x,y] we 
have the map given by the matrix rumat = {{e,0},{0,e}} where 
e=(-1)^(2/n) is the principal nth root of unity.

This gives rise to a group action {z,w} --> {e*z,e*w}.

For Da[n,x,y] we use damat = {{e,0},{0,e^2}}

giving the group action {{z,w} --> {e*z,e^2*w}. So let's define these 
actions explicitly.

groupAction[n_,m_,expr_,x_,y_] := expr /.

Now we'll check the claimed invariance properties for n=9.

In[156]:= InputForm[groupAction[9,1,Ru[9,x,y],x,y]]
x^9 + 9*x^8*y + 36*x^7*y^2 + 84*x^6*y^3 + 126*x^5*y^4 + 126*x^4*y^5 +
  84*x^3*y^6 + 36*x^2*y^7 + 9*x*y^8 + y^9

This is indeed simp,ly Ru[9,x,y]

In[157]:= InputForm[groupAction[9,2,Da[9,x,y],x,y]]
Out[157] x^9 + 9*x^7*y + 27*x^5*y^2 + 30*x^3*y^3 + 9*x*y^4 + y^9

which again is just Da[9,x,y]

(C) It turns out that both Ru[n,x,y] and Da[n,x,y] map the lines x+y=1 
to 1. That is, if we replace y by 1-x the polynomials will evaluate to 

For example:

In[161]:= Ru[9,x,1-x]
Out[161]= 1

In[162]:= Da[9,x,1-x]
Out[162]= 1

The case of Ru[n,x,1-x] should not be a surprise. After all Ru[n,x,y] is 
simply (x+y)^n and this is of course 1 on x+y=1. That Da[n,x,1-x]=1 is a 
bit more subtle.

Also: all polynomial maps with this property that are invariant under 
the action of groupAction[n,1,...] can be obtained from straightforward 
operations that amount to "tensoring", and inversion thereof, of this 
basic polynomial Ru[n,x,y]. Similarly, all polynomial maps that take 
x+y==1 to 1 and are invariant under groupAction[n,2,...] are obtained 
from such operations on Da[n,x,y].

Clearly (he said,) there are no other 2x2 finite matrix group 
representations for which there are invariant polynomials taking x+y=1 
to 1 (except for those obviously equivalent to damat, as these can each 
be represented in two ways).

Anyone else catch these?*

Daniel Lichtblau
Wolfram Research

*Just trolling. This stuff is far from obvious but was familiar from a 
previous lifetime. "Pa" is for Pascal, "Ru" for Rudin, "Da" for 
D'Angelo. Obviously Rudin's polynomials predate Rudin; his contribution 
was to note that they both map the unit ball in C^2 to higher 
dimensional balls (analogous to those polynomials taking x+y-1 to 1), 
and satisfy a group invariance property.

As noted above, the Cardan polynomials from Osler's article are 
D'Angelo's but with one term dropped and signs alternating. For odd n 
D'Angelo had defined them in an article from 1988, showing how they give 
maps from the unit ball in C^2 with properties similar to Rudin's, but 
working with a less trivial group. He also conjectured no other such 
groups could be used in this way, either for C^2 or higher dimension 
domains. Some possible matrix groups had been ruled out around that time 
by F. Forstneric. Ruling out the rest was interesting work, and I think 
it's all covered in D'Angelo's book "Several Complex Variables and the 
Geometry of Real Hypersurfaces". The cases of n even-valued were used in 
an article by D'Angelo from the 90's to discuss mappings from the ball 
to hyperquadrics that have similar invariance properties. We do not know 
if these polynomials were defined in any context prior to that. Various 
combinatorial properties of the D'Angelo polynomials appear in a number 
of his articles. The connection of these to Chebyshev polynomials is 
observed in an American Mathematical Monthly article by Dilcher and 
Stolarsky (October 2005). That all polynomial maps invariant under that 
groupAction[n,2,...] and taking x+y=1 to 1 can be obtained from 
Da[n,x,y] was proved in a hammock.

  • Prev by Date: Re: Points sampled by N[Derivative[]]
  • Next by Date: Re: RE: Re: RE: RE: Functional decomposition (solving f[f[x]] = g[x] for given g)
  • Previous by thread: Re: Re: Re: Reduction of Radicals
  • Next by thread: Re: Reduction of Radicals